public inbox for dev@dpdk.org
 help / color / mirror / Atom feed
* [PATCH v7 0/4] devtools: add AI-assisted code review tools
       [not found] <0260109014106.398156-1-stephen@networkplumber.org>
@ 2026-01-26 18:40 ` Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 1/4] doc: add AGENTS.md for AI-powered " Stephen Hemminger
                     ` (9 more replies)
  0 siblings, 10 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-01-26 18:40 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This series adds infrastructure for AI-assisted code review of DPDK
patches and documentation. It provides guidelines for AI tools and
scripts to interface with multiple AI providers.

The AGENTS.md file consolidates DPDK coding standards, commit message
requirements, and common review feedback into a format optimized for
AI code review tools. The accompanying scripts enable automated patch
analysis using Claude, ChatGPT, Grok, or Gemini.

Patches:
  1. AGENTS.md - Guidelines document for AI review tools
  2. analyze-patch.py - Multi-provider patch review script
  3. compare-reviews.sh - Compare reviews across providers
  4. review-doc.py - Documentation review with diff output

Changes in v7:
  - Add "Review Philosophy" section to AGENTS.md with guidance on
    confidence levels and concise feedback
  - Add "Priority Areas" section covering security, correctness,
    and architecture concerns that reviewers should focus on
  - Minor code cleanups in analyze-patch.py

Changes in v6:
  - Expanded AGENTS.md with "Unnecessary Code Patterns" section
  - Added guidance on rte_malloc() and rte_memcpy() appropriate use
  - Added symbol naming guidelines for static linking
  - Improved email sending in review-doc.py with native SMTP support

Changes in v5:
  - Added review-doc.py for documentation review
  - Added email sending capability to scripts
  - Expanded forbidden tokens documentation

Changes in v4:
  - Added compare-reviews.sh script
  - Multiple output formats (text, markdown, html, json)
  - Improved error handling

Changes in v3:
  - Added support for OpenAI, xAI Grok, and Google Gemini
  - Added prompt caching for Anthropic to reduce costs
  - Restructured AGENTS.md for better machine parsing

Changes in v2:
  - Split into separate patches
  - Added verbose mode with token usage statistics

Stephen Hemminger (4):
  doc: add AGENTS.md for AI-powered code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script

 AGENTS.md                   | 1000 +++++++++++++++++++++++++++++++++++
 devtools/analyze-patch.py   |  731 +++++++++++++++++++++++++
 devtools/compare-reviews.sh |  192 +++++++
 devtools/review-doc.py      |  974 ++++++++++++++++++++++++++++++++++
 4 files changed, 2897 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100644 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.51.0


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v7 1/4] doc: add AGENTS.md for AI-powered code review tools
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
@ 2026-01-26 18:40   ` Stephen Hemminger
  2026-01-30 23:49     ` Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 2/4] devtools: add multi-provider AI patch review script Stephen Hemminger
                     ` (8 subsequent siblings)
  9 siblings, 1 reply; 51+ messages in thread
From: Stephen Hemminger @ 2026-01-26 18:40 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a structured reference document that enables AI code review tools
to validate DPDK contributions against project standards. This document
consolidates requirements from multiple sources into a machine-readable
format optimized for automated validation workflows.

The AGENTS.md file synthesizes guidelines from:
- DPDK Contributing Code documentation (patches.rst)
- DPDK Coding Style guidelines (coding_style.rst)
- DPDK validation scripts (check-git-log.sh, checkpatches.sh)
- Linux kernel patch submission process
- SPDX License Identifier specification
- DPDK Coccinelle scripts (cocci)
- common issues spotted during mailing list review

Key sections include:
- SPDX license and copyright header requirements
- Commit message format with precise limits (60 char subject,
  75 char body) and tag ordering rules
- C coding style including explicit comparison requirements
- Forbidden tokens table derived from checkpatches.sh
- API tag placement rules for experimental and internal APIs
- Patch validation checklists with severity levels

The forbidden tokens section documents restrictions on deprecated
atomics, logging functions, threading APIs, and compiler built-ins
that are checked by the existing checkpatches.sh infrastructure.

Severity levels (error/warning/info) align with the exit codes and
messaging from check-git-log.sh and checkpatches.sh to help automated
tools prioritize feedback appropriately.

References:
- https://doc.dpdk.org/guides/contributing/patches.html
- https://doc.dpdk.org/guides/contributing/coding_style.html
- devtools/check-git-log.sh
- devtools/checkpatches.sh
- devtools/cocci/

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 1000 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1000 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..74d89483f6
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,1000 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux Kernel. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+- Only comment when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already shows (remove them)
+
+### Architecture & Patterns
+- Code that violates existing patterns in the code base
+- Missing error handling
+- Code that is not safe against signals
+
+---
+
+## Source License Requirements
+
+### SPDX License Identifiers
+
+Every source file must begin with an SPDX license identifier, followed
+by the copyright notice, then a blank line before other content.
+
+- SPDX tag on first line (or second line for `#!` scripts)
+- Copyright line immediately follows
+- Blank line after copyright before any code/includes
+- Core libraries and drivers use `BSD-3-Clause`
+- Kernel components use `GPL-2.0`
+- Dual-licensed code uses: `(BSD-3-Clause OR GPL-2.0)`
+
+```c
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 ExampleCorp
+ */
+
+#include <stdio.h>
+```
+
+For scripts:
+```python
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 ExampleCorp
+
+import sys
+```
+
+**Do not include boilerplate license text** - the SPDX identifier is sufficient.
+
+---
+
+## Commit Message Requirements
+
+### Subject Line (First Line)
+
+| Rule | Limit |
+|------|-------|
+| Maximum length | **60 characters** |
+| Format | `component: lowercase description` |
+| Case | Lowercase except acronyms |
+| Mood | Imperative (instructions to codebase) |
+| Punctuation | **No trailing period** |
+
+```
+# Good examples
+net/ixgbe: fix offload config option name
+config: increase max queues per port
+net/mlx5: add support for flow counters
+app/testpmd: fix memory leak in flow create
+
+# Bad examples
+Fixed the offload config option.    # past tense, has period, no prefix
+net/ixgbe: Fix Offload Config       # uppercase after colon
+ixgbe: fix something                # wrong prefix, should be net/ixgbe
+lib/ethdev: add new feature         # wrong prefix, should be ethdev:
+```
+
+#### Headline Format Errors (from check-git-log.sh)
+
+The following are flagged as errors:
+- Tab characters in subject
+- Leading or trailing spaces
+- Trailing period (`.`)
+- Punctuation marks: `, ; ! ? & |`
+- Underscores after the colon (indicates code in subject)
+- Missing colon separator
+- No space after colon
+- Space before colon
+
+#### Common Prefix Mistakes
+
+| Wrong | Correct |
+|-------|---------|
+| `ixgbe:` | `net/ixgbe:` |
+| `lib/ethdev:` | `ethdev:` |
+| `example:` | `examples/foo:` |
+| `apps/` | `app/name:` |
+| `app/test:` | `test:` |
+| `testpmd:` | `app/testpmd:` |
+| `test-pmd:` | `app/testpmd:` |
+| `bond:` | `net/bonding:` |
+
+#### Case-Sensitive Terms
+
+These terms must use exact capitalization (from `devtools/words-case.txt`):
+- `Rx`, `Tx` (not `RX`, `TX`, `rx`, `tx`)
+- `VF`, `PF` (not `vf`, `pf`)
+- `MAC`, `VLAN`, `RSS`, `API`
+- `Linux`, `Windows`, `FreeBSD`
+- Check `devtools/words-case.txt` for complete list
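+
+Applied to subject lines (the driver name is hypothetical):
+
+```
+# Good
+net/foo: fix Rx queue setup on VF
+
+# Bad - wrong capitalization of Rx and VF
+net/foo: fix RX queue setup on vf
+```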
+
+### Commit Body
+
+| Rule | Limit |
+|------|-------|
+| Line wrap | **75 characters** |
+| Exception | `Fixes:` lines may exceed 75 chars |
+
+Body guidelines:
+- Describe the issue being fixed or feature being added
+- Provide enough context for reviewers
+- **Do not start the commit message body with "It"**
+- **Must end with** `Signed-off-by:` line (real name, not alias)
+
+### Fixes Tag
+
+When fixing regressions, use the `Fixes:` tag with a 12-character abbreviated SHA:
+
+```
+Fixes: 0123456789ab ("original commit subject")
+```
+
+The hash must reference a commit in the current branch, and the subject must match exactly.
+
+**Finding maintainers**: Use `devtools/get-maintainer.sh` to identify the current subsystem maintainer from the `MAINTAINERS` file, rather than CC'ing the original author:
+
+```bash
+git send-email --to-cmd ./devtools/get-maintainer.sh --cc dev@dpdk.org 000*.patch
+```
+
+### Required Tags
+
+```
+# For Coverity issues (required if "coverity" mentioned in body):
+Coverity issue: 12345
+
+# For Bugzilla issues (required if "bugzilla" mentioned in body):
+Bugzilla ID: 12345
+
+# For stable release backport candidates:
+Cc: stable@dpdk.org
+
+# For patch dependencies (in commit notes after ---):
+Depends-on: series-NNNNN ("Title of the series")
+```
+
+### Tag Order
+
+Tags must appear in this order:
+
+```
+Coverity issue:
+Bugzilla ID:
+Fixes:
+Cc:
+			  <-- blank line required here
+Reported-by:
+Suggested-by:
+Signed-off-by:
+Acked-by:
+Reviewed-by:
+Tested-by:
+```
+
+**Tag format**: `Tag-name: Full Name <email@domain.com>`
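+
+Putting these rules together, a complete set of tags might look like
+this (names, hash, and subjects are purely illustrative):
+
+```
+Coverity issue: 12345
+Fixes: 0123456789ab ("net/foo: add widget support")
+Cc: stable@dpdk.org
+
+Reported-by: Alice Reporter <alice@example.com>
+Signed-off-by: Bob Developer <bob@example.com>
+Acked-by: Carol Maintainer <carol@example.com>
+```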
+
+---
+
+## C Coding Style
+
+### Line Length
+
+| Context | Limit |
+|---------|-------|
+| Source code | **100 characters** |
+| Commit body | **75 characters** |
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a new line
+- Code style should be consistent within each file
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Consider using the `RTE_` or `rte_` prefix only for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` → Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` → Use: denylist/allowlist, blocklist/passlist
+- `cripple` → Use: impacted, degraded, restricted, immobilized
+- `tribe` → Use: team, squad
+- `sanity check` → Use: coherence check, test, verification
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (!a)             /* Bad - don't use ! on integers */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+### Boolean Usage
+
+- Using `bool` type is allowed
+- Prefer `bool` over `int` when a variable or field is only used as a boolean
+- For structure fields, consider if the size/alignment impact is acceptable
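+
+For example:
+
+```c
+/* Good - intent is clear */
+bool link_up;
+
+/* Less clear - int carrying only 0 or 1 */
+int link_up_state;
+```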
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Flexible Array Members
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free()
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store. For security-sensitive data, use `rte_free_sensitive()` which ensures memory is cleared.
+
+```c
+/* Bad - compiler may eliminate the memset as a dead store */
+memset(secret_key, 0, key_len);
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - for sensitive data, use secure free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()` which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-accessible memory */
+struct rte_mbuf *mbufs = rte_malloc(NULL, n * sizeof(*mbufs), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
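+
+A sketch of the preferred style, assuming the `rte_stdatomic.h`
+wrappers (exact macro names may differ between releases):
+
+```c
+#include <rte_stdatomic.h>
+
+static RTE_ATOMIC(uint64_t) pkt_count;
+
+/* Bad: __atomic_fetch_add(&pkt_count, 1, __ATOMIC_RELAXED); */
+
+/* Good - DPDK atomic wrapper with rte_memory_order */
+rte_atomic_fetch_add_explicit(&pkt_count, 1, rte_memory_order_relaxed);
+```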
+
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
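+
+A minimal sketch of the preferred EAL threading API (signatures
+assumed from `rte_thread.h`; verify against the current release):
+
+```c
+#include <rte_thread.h>
+
+static uint32_t
+worker_main(void *arg)
+{
+	RTE_SET_USED(arg);
+	return 0;
+}
+
+static int
+start_worker(void)
+{
+	rte_thread_t tid;
+	int ret;
+
+	ret = rte_thread_create(&tid, NULL, worker_main, NULL);
+	if (ret != 0)
+		return ret;
+	return rte_thread_join(tid, NULL);
+}
+```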
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `%PRId64`, `%PRIu64`, `%PRIx64` |
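+
+The `PRI*64` macros from `<inttypes.h>` are string literals that are
+concatenated after the `%` sign (the logtype shown is illustrative):
+
+```c
+#include <inttypes.h>
+
+uint64_t pkts;
+
+/* Bad: RTE_LOG(INFO, TESTPMD, "%llu packets\n", pkts); */
+
+/* Good - PRIu64 expands to the correct length modifier */
+RTE_LOG(INFO, TESTPMD, "forwarded %" PRIu64 " packets\n", pkts);
+```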
+
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - used in a .c file (allowed only in headers) */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be exercised by tests under `app/test`
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include "test.h"
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, true, true, test_feature);
+```
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+
+### ABI Compatibility
+
+- New external functions must be exported properly
+- Follow ABI policy and versioning guidelines
+- Enable ABI checks with `DPDK_ABI_REF_VERSION` environment variable
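+
+For example, a reference-based ABI check can be run as (the reference
+tag is illustrative):
+
+```bash
+DPDK_ABI_REF_VERSION=v24.11 ./devtools/test-meson-builds.sh
+```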
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message
+
+- [ ] Subject line ≤60 characters
+- [ ] Subject is lowercase (except acronyms from words-case.txt)
+- [ ] Correct component prefix (e.g., `net/ixgbe:` not `ixgbe:`)
+- [ ] No `lib/` prefix for libraries
+- [ ] Imperative mood, no trailing period
+- [ ] No tabs, leading/trailing spaces, or punctuation marks
+- [ ] Body wrapped at 75 characters
+- [ ] Body does not start with "It"
+- [ ] `Signed-off-by:` present with real name and valid email
+- [ ] `Fixes:` tag present for bug fixes with 12-char SHA and exact subject
+- [ ] `Coverity issue:` tag present if Coverity mentioned
+- [ ] `Bugzilla ID:` tag present if Bugzilla mentioned
+- [ ] `Cc: stable@dpdk.org` for stable backport candidates
+- [ ] Tags in correct order with blank line separator
+
+### License
+
+- [ ] SPDX identifier on first line (or second for scripts)
+- [ ] Copyright line follows SPDX
+- [ ] Blank line after copyright before code
+- [ ] Appropriate license for file type
+
+### Code Style
+
+- [ ] Lines ≤100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (≤3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+
+```python
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list
+headers = files(
+    'header1.h',
+    'header2.h',
+    'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must comply with the project's Python formatting standards
+- Use **`black`** for code formatting validation
+- Line length of up to 100 characters is acceptable
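+
+One way to validate before submitting (the exact invocation is illustrative;
+adjust the path to the files you changed):
+
+```bash
+black --check --line-length 100 devtools/*.py
+```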
+
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+- Missing or malformed SPDX license
+- Missing Signed-off-by
+- Subject line over 60 characters
+- Body lines over 75 characters
+- Wrong tag order or format
+- Missing required tags (Fixes, Coverity issue, Bugzilla ID)
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+
+**Warning** (should fix):
+- Subject line style issues (case, punctuation)
+- Wrong component prefix
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+# Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example:
+- **Problem:** This could panic if the string is NULL.
+- **Fix:** Check for NULL before dereferencing, e.g. `if (str == NULL) return -EINVAL;`.
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v7 2/4] devtools: add multi-provider AI patch review script
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 1/4] doc: add AGENTS.md for AI-powered " Stephen Hemminger
@ 2026-01-26 18:40   ` Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 3/4] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-01-26 18:40 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This is an AI-generated script that reviews DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 731 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 731 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..100a43ea9a
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,731 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from email.message import EmailMessage
+from pathlib import Path
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4o",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-3",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-2.0-flash",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+SYSTEM_PROMPT = """You are an expert DPDK code reviewer. Analyze patches for \
+compliance with DPDK coding standards and contribution guidelines. Provide \
+clear, actionable feedback organized by severity (Error, Warning, Info) as \
+defined in the guidelines."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Check for:
+
+1. Commit message format (subject line, body, tags)
+2. License/copyright compliance
+3. C coding style issues
+4. API and documentation requirements
+5. Any other guideline violations
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg):
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key):
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def build_anthropic_request(
+    model, max_tokens, agents_content, patch_content, patch_name, output_format="text"
+):
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model, max_tokens, agents_content, patch_content, patch_name, output_format="text"
+):
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens, agents_content, patch_content, patch_name, output_format="text"
+):
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider,
+    api_key,
+    model,
+    max_tokens,
+    agents_content,
+    patch_content,
+    patch_name,
+    output_format="text",
+    verbose=False,
+):
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model, max_tokens, agents_content, patch_content, patch_name, output_format
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens, agents_content, patch_content, patch_name, output_format
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model, max_tokens, agents_content, patch_content, patch_name, output_format
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: {usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: {usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def get_last_message_id(patch_content):
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-I[Dd]:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize: remove < > and add them back
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content):
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][8:].strip()
+            i += 1
+            # Handle continuation lines (folded headers join with a space)
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs, cc_addrs, from_addr, subject, in_reply_to, body, dry_run=False
+):
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try sendmail
+        try:
+            subprocess.run(
+                ["sendmail", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via sendmail", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        # Try msmtp
+        try:
+            subprocess.run(
+                ["msmtp", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via msmtp", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers():
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API
+    review_text = call_api(
+        args.provider,
+        api_key,
+        model,
+        args.tokens,
+        agents_content,
+        patch_content,
+        patch_name,
+        args.output_format,
+        args.verbose,
+    )
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model})*
+
+{review_text}
+"""
+    else:  # text
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n\n"
+        output_text += review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0



* [PATCH v7 3/4] devtools: add compare-reviews.sh for multi-provider analysis
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 1/4] doc: add AGENTS.md for AI-powered " Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 2/4] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-01-26 18:40   ` Stephen Hemminger
  2026-01-26 18:40   ` [PATCH v7 4/4] devtools: add multi-provider AI documentation review script Stephen Hemminger
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-01-26 18:40 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.txt when using the
output directory option.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100644 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100644
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: "\
+"ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v7 4/4] devtools: add multi-provider AI documentation review script
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (2 preceding siblings ...)
  2026-01-26 18:40   ` [PATCH v7 3/4] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-01-26 18:40   ` Stephen Hemminger
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-01-26 18:40 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script produces two output files:
  - A unified diff with suggested changes
  - A commit message following DPDK standards

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide
files).
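
The prefix lookup amounts to a first-match scan over an ordered path
table; a minimal sketch using a subset of the actual mapping
(commit_prefix is a stand-in name for illustration):

```python
# Simplified subset of the script's path-to-prefix table; more
# specific paths must come before their parents, because the first
# matching entry wins.
PREFIX_MAP = [
    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
    ("doc/guides/nics/", "doc/guides/nics:"),
    ("doc/guides/", "doc:"),
    ("doc/", "doc:"),
]

def commit_prefix(path):
    """Return the commit subject prefix for a documentation path."""
    for prefix_path, prefix in PREFIX_MAP:
        if path.startswith(prefix_path):
            return prefix
    return "doc:"

print(commit_prefix("doc/guides/nics/ixgbe.rst"))  # doc/guides/nics:
```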

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large documents
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py -p xai doc/guides/nics/ixgbe.rst
  git apply mempool_lib.diff && git commit -sF mempool_lib.msg

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 974 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 974 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..7fe95b88b1
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,974 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+from email.message import EmailMessage
+from pathlib import Path
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4o",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-3",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-2.0-flash",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds via git commit --sign)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg):
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key):
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config():
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    if config["port"]:
+        config["port"] = int(config["port"])
+
+    return config
+
+
+def get_commit_prefix(filepath):
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_anthropic_request(
+    model,
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+):
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model,
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+):
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+):
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider,
+    api_key,
+    model,
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+    verbose=False,
+):
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: " f"{usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: " f"{usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def parse_review_text(review_text):
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def send_email(
+    to_addrs,
+    cc_addrs,
+    from_addr,
+    subject,
+    in_reply_to,
+    body,
+    dry_run=False,
+    verbose=False,
+):
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+        return True
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers():
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst
+    %(prog)s -f markdown doc/guides/cryptodevs/qat.rst
+    %(prog)s -f json -O review.json doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+        """,
+    )
+
+    parser.add_argument("doc_file", nargs="?", help="Documentation file")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for .diff and .msg files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-O",
+        "--output-file",
+        metavar="FILE",
+        help="Write full review to file (in addition to .diff/.msg)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check doc file is provided
+    if not args.doc_file:
+        parser.error("doc_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    doc_path = Path(args.doc_file)
+    if not doc_path.exists():
+        error(f"Documentation file not found: {args.doc_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine output filenames
+    doc_basename = doc_path.stem
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    diff_file = output_dir / f"{doc_basename}.diff"
+    msg_file = output_dir / f"{doc_basename}.msg"
+
+    # Get commit prefix
+    commit_prefix = get_commit_prefix(args.doc_file)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    doc_content = doc_path.read_text()
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Doc file: {args.doc_file}", file=sys.stderr)
+        print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+        print(f"Output dir: {args.output_dir}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API
+    provider_name = config["name"]
+    review_text = call_api(
+        args.provider,
+        api_key,
+        model,
+        args.tokens,
+        agents_content,
+        doc_content,
+        args.doc_file,
+        commit_prefix,
+        args.output_format,
+        args.verbose,
+    )
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Process based on output format
+    if args.output_format == "text":
+        # Parse and write diff/msg files
+        commit_msg, diff = parse_review_text(review_text)
+
+        if commit_msg:
+            msg_file.write_text(commit_msg + "\n")
+            print(f"Commit message written to: {msg_file}", file=sys.stderr)
+        else:
+            msg_file.write_text("# No commit message generated\n")
+            print("Warning: Could not extract commit message", file=sys.stderr)
+
+        if diff:
+            diff_file.write_text(diff + "\n")
+            print(f"Diff written to: {diff_file}", file=sys.stderr)
+        else:
+            diff_file.write_text("# No changes suggested\n")
+            print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print full review
+        print(
+            f"\n=== Documentation Review: {doc_path.name} (via {provider_name}) ==="
+        )
+        print(review_text)
+
+    elif args.output_format == "json":
+        # Try to parse JSON and extract diff/msg
+        try:
+            review_data = json.loads(review_text)
+            commit_msg = review_data.get("commit_message", "")
+            diff = review_data.get("diff", "")
+
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+
+        except json.JSONDecodeError:
+            print("Warning: Response is not valid JSON", file=sys.stderr)
+            review_data = {"raw_response": review_text}
+
+        # Add metadata
+        output_data = {
+            "metadata": {
+                "doc_file": args.doc_file,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "commit_prefix": commit_prefix,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+        print(output_text)
+
+    elif args.output_format == "markdown":
+        output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{review_text}
+"""
+        print(output_text)
+
+    elif args.output_format == "html":
+        output_text = f"""<!-- Documentation review of {doc_path.name} -->
+<!-- Reviewed using {provider_name} ({model}) -->
+<div class="doc-review">
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+{review_text}
+</div>
+"""
+        print(output_text)
+
+    # Write to output file if requested
+    if args.output_file:
+        if args.output_format in ("json", "markdown", "html"):
+            Path(args.output_file).write_text(output_text)
+        else:
+            Path(args.output_file).write_text(review_text)
+        print(f"Full review written to: {args.output_file}", file=sys.stderr)
+
+    # Print usage instructions for text format
+    if args.output_format == "text":
+        print("\n=== Output Files ===")
+        print(f"Commit message: {msg_file}")
+        print(f"Diff file:      {diff_file}")
+        print("\nTo apply changes:")
+        print(f"  git apply {diff_file}")
+        print(f"  git commit -sF {msg_file}")
+
+    # Send email if requested
+    if args.send_email:
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+        # Build email body
+        email_body = f"""AI-generated documentation review of {args.doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            None,
+            email_body,
+            args.dry_run,
+            args.verbose,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v7 1/4] doc: add AGENTS.md for AI-powered code review tools
  2026-01-26 18:40   ` [PATCH v7 1/4] doc: add AGENTS.md for AI-powered " Stephen Hemminger
@ 2026-01-30 23:49     ` Stephen Hemminger
  0 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-01-30 23:49 UTC (permalink / raw)
  To: dev

On Mon, 26 Jan 2026 10:40:18 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> Add a structured reference document that enables AI code review tools
> to validate DPDK contributions against project standards. This document
> consolidates requirements from multiple sources into a machine-readable
> format optimized for automated validation workflows.
> 
> The AGENTS.md file synthesizes guidelines from:
> - DPDK Contributing Code documentation (patches.rst)
> - DPDK Coding Style guidelines (coding_style.rst)
> - DPDK validation scripts (check-git-log.sh, checkpatches.sh)
> - Linux kernel patch submission process
> - SPDX License Identifier specification
> - DPDK Coccinelle scripts (cocci)
> - common items spotted on mailing list review
> 
> Key sections include:
> - SPDX license and copyright header requirements
> - Commit message format with precise limits (60 char subject,
>   75 char body) and tag ordering rules
> - C coding style including explicit comparison requirements
> - Forbidden tokens table derived from checkpatches.sh
> - API tag placement rules for experimental and internal APIs
> - Patch validation checklists with severity levels
> 
> The forbidden tokens section documents restrictions on deprecated
> atomics, logging functions, threading APIs, and compiler built-ins
> that are checked by the existing checkpatches.sh infrastructure.
> 
> Severity levels (error/warning/info) align with the exit codes and
> messaging from check-git-log.sh and checkpatches.sh to help automated
> tools prioritize feedback appropriately.
> 
> References:
> - https://doc.dpdk.org/guides/contributing/patches.html
> - https://doc.dpdk.org/guides/contributing/coding_style.html
> - devtools/check-git-log.sh
> - devtools/checkpatches.sh
> - devtools/cocci/
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

Rather than one potentially huge file, I am thinking of breaking it
into pieces and putting it in a directory, similar to
https://github.com/masoncl/review-prompts/blob/main/README.md

Maybe review-tools/ directory at top level.



* [PATCH v8 0/6] add AGENTS.md and scripts for AI code review
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (3 preceding siblings ...)
  2026-01-26 18:40   ` [PATCH v7 4/4] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-02-09 19:48   ` Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
                       ` (5 more replies)
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                     ` (4 subsequent siblings)
  9 siblings, 6 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a structured reference document that enables AI code review tools
to validate DPDK contributions against project standards. This document
consolidates requirements from multiple sources into a machine-readable
format optimized for automated validation workflows.

The AGENTS.md file synthesizes guidelines from:
- DPDK Contributing Code documentation (patches.rst)
- DPDK Coding Style guidelines (coding_style.rst)
- DPDK validation scripts (check-git-log.sh, checkpatches.sh)
- Linux kernel patch submission process
- SPDX License Identifier specification
- DPDK Coccinelle scripts (cocci)
- common items spotted on mailing list review

Key sections include:
- SPDX license and copyright header requirements
- Commit message format with precise limits (60 char subject,
  75 char body) and tag ordering rules
- C coding style including explicit comparison requirements
- Forbidden tokens table derived from checkpatches.sh
- API tag placement rules for experimental and internal APIs
- Patch validation checklists with severity levels

The forbidden tokens section documents restrictions on deprecated
atomics, logging functions, threading APIs, and compiler built-ins
that are checked by the existing checkpatches.sh infrastructure.

Severity levels (error/warning/info) align with the exit codes and
messaging from check-git-log.sh and checkpatches.sh to help automated
tools prioritize feedback appropriately.

References:
- https://doc.dpdk.org/guides/contributing/patches.html
- https://doc.dpdk.org/guides/contributing/coding_style.html
- devtools/check-git-log.sh
- devtools/checkpatches.sh
- devtools/cocci/

v8 - revisions to AGENTS.md to detect more bugs.
     The previous prompt was screening out leaks that the AI was not
     confident enough to report.


Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Stephen Hemminger (6):
  doc: add AGENTS.md for AI code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script
  doc: add AI-assisted patch review to contributing guide
  MAINTAINERS: add section for AI review tools

 AGENTS.md                              | 1514 ++++++++++++++++++++++++
 MAINTAINERS                            |    8 +
 devtools/analyze-patch.py              | 1334 +++++++++++++++++++++
 devtools/compare-reviews.sh            |  192 +++
 devtools/review-doc.py                 | 1098 +++++++++++++++++
 doc/guides/contributing/new_driver.rst |    2 +
 doc/guides/contributing/patches.rst    |   56 +
 7 files changed, 4204 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100755 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.51.0



* [PATCH v8 1/6] doc: add AGENTS.md for AI code review tools
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-02-09 19:48     ` Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add a reference document that enables AI code review tools to
validate DPDK patches against project coding standards and
submission requirements.

The document consolidates guidelines from the DPDK contributor
documentation, coding style guide, and validation scripts
(check-git-log.sh, checkpatches.sh) into a single structured
file. It covers SPDX license requirements, commit message
formatting, C coding style rules, forbidden token restrictions,
API tag placement, and severity levels aligned with existing
validation tool output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 1514 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1514 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..e14e4284f5
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,1514 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+## CRITICAL INSTRUCTION - READ FIRST
+
+This document has two categories of review rules with different
+confidence thresholds:
+
+### 1. Correctness Bugs -- HIGHEST PRIORITY (report at >=50% confidence)
+
+**Always report potential correctness bugs.** These are the most
+valuable findings. When in doubt, report them with a note about
+your confidence level. A possible use-after-free or resource leak
+is worth mentioning even if you are not certain.
+
+Correctness bugs include:
+- Use-after-free (accessing memory after `free`/`rte_free`)
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference
+- Buffer overflows or out-of-bounds access
+- Uninitialized variable use in a reachable code path
+- Race conditions (unsynchronized shared state)
+- Missing error checks on functions that can fail
+- Error paths that skip cleanup (goto labels, missing free/close)
+- Incorrect error propagation (wrong return value, lost errno)
+- Logic errors in conditionals (wrong operator, inverted test)
+- Integer overflow/truncation in size calculations
+- Missing bounds checks on user-supplied sizes or indices
+
+**Do NOT self-censor correctness bugs.** If you identify a code
+path where a resource could leak or memory could be used after
+free, report it. Do not talk yourself out of it.
+
+### 2. Style, Process, and Formatting -- suppress false positives
+
+**NEVER list a style/process item under "Errors" or "Warnings" if
+you conclude it is correct.**
+
+Before outputting any style, formatting, or process error/warning,
+verify it is actually wrong. If your analysis concludes with
+phrases like "there's no issue here", "which is fine", "appears
+correct", "is acceptable", or "this is actually correct" -- then
+DO NOT INCLUDE IT IN YOUR OUTPUT AT ALL. Delete it. Omit it
+entirely.
+
+This suppression rule applies to: SPDX/copyright format, commit
+message formatting, tag ordering, line length, naming conventions,
+code style, and process compliance. It does NOT apply to
+correctness bugs listed above.
+
+---
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux Kernel. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+
+**Correctness bugs are the primary goal of AI review.** Style and
+formatting checks are secondary. A review that catches a
+use-after-free but misses a style nit is far more valuable than
+one that catches every style issue but misses the bug.
+
+**BEFORE OUTPUTTING YOUR REVIEW**: Re-read each item.
+- For correctness bugs: keep them. If you have reasonable doubt
+  that a code path is safe, report it.
+- For style/process items: if ANY item contains phrases like "is
+  fine", "no issue", "appears correct", "is acceptable",
+  "actually correct" -- DELETE THAT ITEM. Do not include it.
+
+### Correctness review guidelines
+- Trace error paths: for every function that allocates a resource
+  or acquires a lock, verify that ALL error paths after that point
+  release it
+- Check every `goto error` and early `return`: does it clean up
+  everything allocated so far?
+- Look for use-after-free: after `free(p)`, is `p` accessed again?
+- Check that error codes are propagated, not silently dropped
+- Report at >=50% confidence; note uncertainty if appropriate
+- It is better to report a potential bug that turns out to be safe
+  than to miss a real bug
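+
+A minimal sketch of the leak pattern to trace (hypothetical function
+and names, for illustration only):
+
+```c
+int
+example_init(struct ctx *c)
+{
+	c->buf = rte_malloc(NULL, BUF_SIZE, 0);
+	if (c->buf == NULL)
+		return -ENOMEM;
+
+	c->fd = open("/dev/example", O_RDWR);
+	if (c->fd < 0) {
+		rte_free(c->buf);	/* without this, the early allocation leaks */
+		return -errno;
+	}
+
+	return 0;
+}
+```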
+
+### Style and process review guidelines
+- Only comment on style/process issues when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+- Do NOT comment on copyright years unless outside valid range (2013 to current year)
+- Do NOT report an issue then contradict yourself - if something is acceptable, do not mention it at all
+- Do NOT include items in Errors/Warnings that you then say are "acceptable" or "correct"
+- Do NOT mention things that are correct or "not an issue" - only report actual problems
+- Do NOT speculate about contributor circumstances (employment, company policies, etc.)
+- Before adding any style item to your review, ask: "Is this actually wrong?" If no, omit it entirely.
+- VERIFY before reporting: For subject line length, COUNT the characters first. If <=60, do not mention it.
+- NEVER write "(Correction: ...)" - if you need to correct yourself, simply omit the item entirely
+- Do NOT add vague suggestions like "should be verified" or "should be checked" - if it is not clearly wrong, do not mention it
+- Do NOT flag something as an Error then say "which is correct" in the same item
+- Do NOT say "no issue here" or "this is actually correct" - if there's no issue, do not include it in your review
+- Do NOT call the standard DPDK SPDX/copyright format "different style" - it is THE standard
+- Do NOT analyze cross-patch dependencies or compilation order - you cannot reliably determine this from patch review
+- Do NOT claim a patch "would cause compilation failure" based on symbols used in other patches in the series
+- Review each patch individually for its own correctness; assume the patch author ordered them correctly
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- **Use-after-free** (any access to memory after it has been freed)
+- **Error path resource leaks**: For every allocation or fd open,
+  trace each error path (`goto`, early `return`, conditional) to
+  verify the resource is released. Common patterns to check:
+  - `malloc`/`rte_malloc` followed by a failure that does `return -1`
+    instead of `goto cleanup`
+  - `open()`/`socket()` fd not closed on a later error
+  - Lock acquired but not released on an error branch
+  - Partially initialized structure where early fields are allocated
+    but later allocation fails without freeing the early ones
+- **Double-free / double-close**: resource freed in both a normal
+  path and an error path, or fd closed but not set to -1 allowing
+  a second close
+- **Missing error checks**: functions that can fail (malloc, open,
+  ioctl, etc.) whose return value is not checked
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Usage of deprecated APIs when replacements exist
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already shows (remove them)
+- **Process-shared synchronization errors** (pthread mutexes in shared memory without `PTHREAD_PROCESS_SHARED`)
+
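+A sketch of the correct initialization for a mutex placed in shared
+memory (illustrative; `shm` stands for a hypothetical shared-memory
+structure):
+
+```c
+pthread_mutexattr_t attr;
+
+pthread_mutexattr_init(&attr);
+pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+pthread_mutex_init(&shm->lock, &attr);
+pthread_mutexattr_destroy(&attr);
+```
+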
+### Architecture & Patterns
+- Code that violates existing patterns in the code base
+- Missing error handling
+- Code that is not safe against signals
+
+---
+
+## Source License Requirements
+
+### SPDX License Identifiers
+
+Every source file must begin with an SPDX license identifier, followed
+by the copyright notice, then a blank line before other content.
+
+- SPDX tag on first line (or second line for `#!` scripts)
+- Copyright line immediately follows
+- Blank line after copyright before any code/includes
+- Core libraries and drivers use `BSD-3-Clause`
+- Kernel components use `GPL-2.0`
+- Dual-licensed code uses: `(BSD-3-Clause OR GPL-2.0)`
+
+```c
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 John Smith
+ */
+
+#include <stdio.h>
+```
+
+For scripts:
+```python
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Jane Doe
+
+import sys
+```
+
+**Do not include boilerplate license text** - the SPDX identifier is sufficient.
+
+**Do NOT flag copyright years** - Copyright years reflect when code was written. Valid years range from 2013 (DPDK's first release) through the current year. Only flag years outside this range (e.g., years before 2013 or future years beyond the current date).
+
+**Copyright holders can be individuals or organizations** - Both are equally valid. NEVER comment on, question, or speculate about copyright holders. Do not mention employer policies, company resources, or suggest copyright "should" be assigned differently. The copyright holder's choice is not subject to review.
+
+**The following SPDX/copyright format is correct** - do NOT flag it or comment on it:
+```c
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 Stephen Hemminger
+ */
+```
+This is the standard DPDK format. Do not say it is "different" or "unusual" - it is correct.
+Do not suggest the copyright year "should be verified" - if it's in the valid range (2013-current year), it's fine.
+If SPDX is on line 1 and copyright follows, this is CORRECT - do not include it in Errors.
+
+---
+
+## Commit Message Requirements
+
+### Subject Line (First Line)
+
+| Rule | Limit |
+|------|-------|
+| Maximum length | **60 characters** |
+| Format | `component: lowercase description` |
+| Case | Lowercase except acronyms |
+| Mood | Imperative (instructions to codebase) |
+| Punctuation | **No trailing period** |
+
+**Before flagging subject line length**: Actually count the characters. Only flag if >60. Do not flag then correct yourself.
+
+```
+# Good examples
+net/ixgbe: fix offload config option name
+config: increase max queues per port
+net/mlx5: add support for flow counters
+app/testpmd: fix memory leak in flow create
+
+# Bad examples
+Fixed the offload config option.    # past tense, has period, no prefix
+net/ixgbe: Fix Offload Config       # uppercase after colon
+ixgbe: fix something                # wrong prefix, should be net/ixgbe
+lib/ethdev: add new feature         # wrong prefix, should be ethdev:
+```
+
+#### Headline Format Errors (from check-git-log.sh)
+
+The following are flagged as errors:
+- Tab characters in subject
+- Leading or trailing spaces
+- Trailing period (`.`)
+- Punctuation marks: `, ; ! ? & |`
+- Underscores after the colon (indicates code in subject)
+- Missing colon separator
+- No space after colon
+- Space before colon
+
+#### Common Prefix Mistakes
+
+| Wrong | Correct |
+|-------|---------|
+| `ixgbe:` | `net/ixgbe:` |
+| `lib/ethdev:` | `ethdev:` |
+| `example:` | `examples/foo:` |
+| `apps/` | `app/name:` |
+| `app/test:` | `test:` |
+| `testpmd:` | `app/testpmd:` |
+| `test-pmd:` | `app/testpmd:` |
+| `bond:` | `net/bonding:` |
+
+#### Case-Sensitive Terms (Commit Messages Only)
+
+These terms must use exact capitalization **in commit messages** (from `devtools/words-case.txt`):
+- `Rx`, `Tx` (not `RX`, `TX`, `rx`, `tx`)
+- `VF`, `PF` (not `vf`, `pf`)
+- `MAC`, `VLAN`, `RSS`, `API`
+- `Linux`, `Windows`, `FreeBSD`
+- Check `devtools/words-case.txt` for complete list
+
+**Note**: These rules apply to commit messages only, NOT to code comments or documentation.
+
+### Commit Body
+
+| Rule | Limit |
+|------|-------|
+| Line wrap | **75 characters** |
+| Exception | `Fixes:` lines may exceed 75 chars |
+
+Body guidelines:
+- Describe the issue being fixed or feature being added
+- Provide enough context for reviewers
+- **Do not start the commit message body with "It"**
+- **Must end with** `Signed-off-by:` line (real name, not alias)
+
+### Fixes Tag
+
+When fixing regressions, use the `Fixes:` tag with a 12-character abbreviated SHA:
+
+```
+Fixes: abcdef012345 ("original commit subject")
+```
+
+The hash must reference a commit in the current branch, and the subject must match exactly.
+
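+A correctly formatted `Fixes:` line can be generated from the original
+commit with the git alias suggested in the DPDK contributing guide:
+
+```bash
+git config alias.fixline "log -1 --abbrev=12 --format='Fixes: %h (\"%s\")'"
+git fixline <sha>
+```
+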
+**Do NOT flag Fixes tags** asking for verification that the commit "exists in the tree" or "cannot verify" - you cannot verify this from a patch review. If the format is correct (12-char SHA, quoted subject), accept it. NEVER say "cannot verify this exists".
+
+**Finding maintainers**: Use `devtools/get-maintainer.sh` to identify the current subsystem maintainer from the `MAINTAINERS` file, rather than CC'ing the original author:
+
+```bash
+git send-email --to-cmd ./devtools/get-maintainer.sh --cc dev@dpdk.org 000*.patch
+```
+
+### Required Tags
+
+```
+# For Coverity issues (required if "coverity" mentioned in body):
+Coverity issue: 12345
+
+# For Bugzilla issues (required if "bugzilla" mentioned in body):
+Bugzilla ID: 12345
+
+# For stable release backport candidates:
+Cc: stable@dpdk.org
+
+# For patch dependencies (in commit notes after ---):
+Depends-on: series-NNNNN ("Title of the series")
+```
+
+### Tag Order
+
+Tags must appear in this order, with a blank line separating the two groups:
+
+**Group 1** (optional tags, no blank lines within this group):
+- Coverity issue:
+- Bugzilla ID:
+- Fixes:
+- Cc:
+
+**Blank line required here** (only if Group 1 tags are present)
+
+**Group 2** (no blank lines within this group):
+- Reported-by:
+- Suggested-by:
+- Signed-off-by:
+- Acked-by:
+- Reviewed-by:
+- Tested-by:
+
+**Correct examples:**
+
+Simple patch with no Group 1 tags (most common):
+```
+The info_get callback doesn't need to check its args
+since already done by ethdev.
+
+Signed-off-by: John Smith <john@example.com>
+```
+
+Patch with Fixes and Cc tags:
+```
+Fixes: c743e50c475f ("null: new poll mode driver")
+Cc: stable@dpdk.org
+
+Signed-off-by: John Smith <john@example.com>
+```
+
+Patch with only Fixes tag:
+```
+Fixes: abcd1234abcd ("component: original commit")
+
+Signed-off-by: Jane Doe <jane@example.com>
+```
+
+**What is correct (do NOT flag):**
+- Signed-off-by directly after commit body when there are no Group 1 tags - this is CORRECT
+- No blank line between `Fixes:` and `Cc:` - this is CORRECT
+- Blank line between `Cc:` (or last tag in Group 1) and `Signed-off-by:` - this is CORRECT
+
+**What is wrong (DO flag):**
+- Missing blank line between Group 1 tags and Group 2 tags (when Group 1 tags exist)
+
+**Tag format**: `Tag-name: Full Name <email@domain.com>`
+
+---
+
+## C Coding Style
+
+### Line Length
+
+| Context | Limit |
+|---------|-------|
+| Source code | **100 characters** |
+| Commit body | **75 characters** |
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a new line
+- Code style should be consistent within each file
+
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Consider using the `RTE_` or `rte_` prefix only for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` -> Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` -> Use: denylist/allowlist, blocklist/passlist
+- `cripple` -> Use: impacted, degraded, restricted, immobilized
+- `tribe` -> Use: team, squad
+- `sanity check` -> Use: coherence check, test, verification
+
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (likely(p != NULL))   /* Good - likely/unlikely don't change this */
+if (unlikely(p == NULL)) /* Good - likely/unlikely don't change this */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (errno != 0)     /* Good - this IS explicit */
+if (likely(a != 0)) /* Good - likely/unlikely don't change this */
+if (!a)             /* Bad - don't use ! on integers */
+if (a)              /* Bad - implicit, should be a != 0 */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+**Explicit comparison** means using `==` or `!=` operators (e.g., `x != 0`, `p == NULL`).
+**Implicit comparison** means relying on truthiness without an operator (e.g., `if (x)`, `if (!p)`).
+**Note**: `likely()` and `unlikely()` macros do NOT affect whether a comparison is explicit or implicit.
+
+### Boolean Usage
+
+- Using `bool` type is allowed
+- Prefer `bool` over `int` when a variable or field is only used as a boolean
+- For structure fields, consider if the size/alignment impact is acceptable
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Variable-Length Arrays
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free()
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store. For security-sensitive data, use `rte_free_sensitive()` which ensures memory is cleared.
+
+```c
+/* Bad - compiler may eliminate memset */
+memset(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - for sensitive data, use secure free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()` which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-accessible memory such as descriptor rings */
+struct rx_desc *ring = rte_malloc(NULL, n * sizeof(*ring), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
+
+### Process-Shared Synchronization
+
+When placing synchronization primitives in shared memory (memory accessible by multiple processes, such as DPDK primary/secondary processes or `mmap`'d regions), they **must** be initialized with process-shared attributes. Failure to do so causes **undefined behavior** that may appear to work in testing but fail unpredictably in production.
+
+#### pthread Mutexes in Shared Memory
+
+**This is an error** - mutex in shared memory without `PTHREAD_PROCESS_SHARED`:
+
+```c
+/* BAD - undefined behavior when used across processes */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+void init_shared(struct shared_data *shm) {
+	pthread_mutex_init(&shm->lock, NULL);  /* ERROR: missing pshared attribute */
+}
+```
+
+**Correct implementation**:
+
+```c
+/* GOOD - properly initialized for cross-process use */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int init_shared(struct shared_data *shm) {
+	pthread_mutexattr_t attr;
+	int ret;
+
+	ret = pthread_mutexattr_init(&attr);
+	if (ret != 0)
+		return -ret;
+
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret != 0) {
+		pthread_mutexattr_destroy(&attr);
+		return -ret;
+	}
+
+	ret = pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	return -ret;
+}
+```
+
+#### pthread Condition Variables in Shared Memory
+
+Condition variables also require the process-shared attribute:
+
+```c
+/* BAD - will not work correctly across processes */
+pthread_cond_init(&shm->cond, NULL);
+
+/* GOOD */
+pthread_condattr_t cattr;
+pthread_condattr_init(&cattr);
+pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+pthread_cond_init(&shm->cond, &cattr);
+pthread_condattr_destroy(&cattr);
+```
+
+#### pthread Read-Write Locks in Shared Memory
+
+```c
+/* BAD */
+pthread_rwlock_init(&shm->rwlock, NULL);
+
+/* GOOD */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init(&rwattr);
+pthread_rwlockattr_setpshared(&rwattr, PTHREAD_PROCESS_SHARED);
+pthread_rwlock_init(&shm->rwlock, &rwattr);
+pthread_rwlockattr_destroy(&rwattr);
+```
+
+#### When to Flag This Issue
+
+Flag as an **Error** when ALL of the following are true:
+1. A `pthread_mutex_t`, `pthread_cond_t`, `pthread_rwlock_t`, or `pthread_barrier_t` is initialized
+2. The primitive is stored in shared memory (identified by context such as: structure in `rte_malloc`/`rte_memzone`, `mmap`'d memory, memory passed to secondary processes, or structures documented as shared)
+3. The initialization uses `NULL` attributes or attributes without `PTHREAD_PROCESS_SHARED`
+
+**Do NOT flag** when:
+- The mutex is in thread-local or process-private heap memory (`malloc`)
+- The mutex is a local/static variable not in shared memory
+- The code already uses `pthread_mutexattr_setpshared()` with `PTHREAD_PROCESS_SHARED`
+- The synchronization uses DPDK primitives (`rte_spinlock_t`, `rte_rwlock_t`) which are designed for shared memory
+
+#### Preferred Alternatives
+
+For DPDK code, prefer DPDK's own synchronization primitives which are designed for shared memory:
+
+| pthread Primitive | DPDK Alternative |
+|-------------------|------------------|
+| `pthread_mutex_t` | `rte_spinlock_t` (busy-wait) or properly initialized pthread mutex |
+| `pthread_rwlock_t` | `rte_rwlock_t` |
+| `pthread_spinlock_t` | `rte_spinlock_t` |
+
+Note: `rte_spinlock_t` and `rte_rwlock_t` work correctly in shared memory without special initialization, but they are spinning locks unsuitable for long wait times.
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `PRId64`, `PRIu64`, `PRIx64` (used as `"%" PRId64`) |
+
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
+---
+
+## Deprecated API Usage
+
+New patches must not introduce usage of deprecated APIs, macros, or functions.
+Deprecated items are marked with `RTE_DEPRECATED` or documented in the
+deprecation notices section of the release notes.
+
+### Rules for New Code
+
+- Do not call functions marked with `RTE_DEPRECATED` or `__rte_deprecated`
+- Do not use macros that have been superseded by newer alternatives
+- Do not use data structures or enum values marked as deprecated
+- Check `doc/guides/rel_notes/deprecation.rst` for planned deprecations
+- When a deprecated API has a replacement, use the replacement
+
+### Deprecating APIs
+
+A patch may mark an API as deprecated provided:
+
+- No remaining usages exist in the current DPDK codebase
+- The deprecation is documented in the release notes
+- A migration path or replacement API is documented
+- The `RTE_DEPRECATED` macro is used to generate compiler warnings
+
+```c
+/* Marking a function as deprecated */
+__rte_deprecated
+int
+rte_old_function(void);
+
+/* With a message pointing to the replacement */
+__rte_deprecated_msg("use rte_new_function() instead")
+int
+rte_old_function(void);
+```
+
+### Common Deprecated Patterns
+
+| Deprecated | Replacement | Notes |
+|-----------|-------------|-------|
+| `rte_atomic*_t` types | C11 atomics | Use `rte_atomic_xxx()` wrappers |
+| `rte_smp_*mb()` barriers | `rte_atomic_thread_fence()` | See Atomics section |
+| `pthread_*()` in portable code | `rte_thread_*()` | See Threading section |
+
+When reviewing patches that add new code, flag any usage of deprecated APIs
+as requiring change to use the modern replacement.
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - in .c file */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+**Note for AI reviewers**: You cannot verify compilation order or cross-patch dependencies from patch review alone. Do NOT flag patches claiming they "would fail to compile" based on symbols used in other patches in the series. Assume the patch author has ordered them correctly.
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be used in `/app` test directory
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include "test.h"
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, NOHUGE_OK, ASAN_OK, test_feature);
+```
+
+The `REGISTER_FAST_TEST` macro parameters are:
+- Test name (e.g., `feature_autotest`)
+- `NOHUGE_OK` or `HUGEPAGES_REQUIRED` - whether test can run without hugepages
+- `ASAN_OK` or `ASAN_FAILS` - whether test is compatible with Address Sanitizer
+- Test function name
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in the same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+- Release notes are NOT required for:
+  - Test-only changes (unit tests, functional tests)
+  - Internal APIs and helper functions (not exported to applications)
+  - Internal implementation changes that don't affect public API
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+- Internal APIs (used only within DPDK, not exported to applications) do NOT require release notes
+
+### ABI Compatibility and Symbol Exports
+
+**IMPORTANT**: DPDK uses automatic symbol map generation. Do **NOT** recommend
+manually editing `version.map` files - they are auto-generated from source code
+annotations.
+
+#### Symbol Export Macros
+
+New public functions must be annotated with export macros (defined in
+`rte_export.h`). Place the macro on the line immediately before the function
+definition in the `.c` file:
+
+```c
+/* For stable ABI symbols */
+RTE_EXPORT_SYMBOL(rte_foo_create)
+int
+rte_foo_create(struct rte_foo_config *config)
+{
+    /* ... */
+}
+
+/* For experimental symbols (include version when first added) */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_new_feature, 25.03)
+__rte_experimental
+int
+rte_foo_new_feature(void)
+{
+    /* ... */
+}
+
+/* For internal symbols (shared between DPDK components only) */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_foo_internal_helper)
+int
+rte_foo_internal_helper(void)
+{
+    /* ... */
+}
+```
+
+#### Symbol Export Rules
+
+- `RTE_EXPORT_SYMBOL` - Use for stable ABI functions
+- `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, ver)` - Use for new experimental APIs
+  (version is the DPDK release, e.g., `25.03`)
+- `RTE_EXPORT_INTERNAL_SYMBOL` - Use for functions shared between DPDK libs/drivers
+  but not part of public API
+- Export macros go in `.c` files, not headers
+- The build system generates linker version maps automatically
+
+#### What NOT to Review
+
+- Do **NOT** flag missing `version.map` updates - maps are auto-generated
+- Do **NOT** suggest adding symbols to `lib/*/version.map` files
+
+#### ABI Versioning for Changed Functions
+
+When changing the signature of an existing stable function, use versioning macros
+from `rte_function_versioning.h`:
+
+- `RTE_VERSION_SYMBOL` - Create versioned symbol for backward compatibility
+- `RTE_DEFAULT_SYMBOL` - Mark the new default version
+
+Follow ABI policy and versioning guidelines in the contributor documentation.
+Enable ABI checks with `DPDK_ABI_REF_VERSION` environment variable.
+
+---
+
+## LTS (Long Term Stable) Release Review
+
+LTS releases are DPDK versions ending in `.11` (e.g., 23.11, 22.11,
+21.11, 20.11, 19.11). When reviewing patches targeting an LTS branch,
+apply stricter criteria:
+
+### LTS-Specific Rules
+
+- **Only bug fixes allowed** -- no new features
+- **No new APIs** (experimental or stable)
+- **ABI must remain unchanged** -- no symbol additions, removals,
+  or signature changes
+- Backported fixes should reference the original commit with a
+  `Fixes:` tag
+- Copyright years should reflect when the code was originally
+  written
+- Be conservative: reject changes that are not clearly bug fixes
+
+### What to Flag on LTS Branches
+
+**Error:**
+- New feature code (new functions, new driver capabilities)
+- New experimental or stable API additions
+- ABI changes (new or removed symbols, changed function signatures)
+- Changes that add new configuration options or parameters
+
+**Warning:**
+- Large refactoring that goes beyond what is needed for a fix
+- Missing `Fixes:` tag on a backported bug fix
+- Missing `Cc: stable@dpdk.org`
+
+### When LTS Rules Apply
+
+LTS rules apply when the reviewer is told the target release is an
+LTS version (via the `--release` option or equivalent). If no
+release is specified, assume the patch targets the main development
+branch where new features and APIs are allowed.
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message
+
+- [ ] Subject line <=60 characters
+- [ ] Subject is lowercase (except acronyms from words-case.txt)
+- [ ] Correct component prefix (e.g., `net/ixgbe:` not `ixgbe:`)
+- [ ] No `lib/` prefix for libraries
+- [ ] Imperative mood, no trailing period
+- [ ] No tabs, leading/trailing spaces, or punctuation marks
+- [ ] Body wrapped at 75 characters
+- [ ] Body does not start with "It"
+- [ ] `Signed-off-by:` present with real name and valid email
+- [ ] `Fixes:` tag present for bug fixes with 12-char SHA and exact subject
+- [ ] `Coverity issue:` tag present if Coverity mentioned
+- [ ] `Bugzilla ID:` tag present if Bugzilla mentioned
+- [ ] `Cc: stable@dpdk.org` for stable backport candidates
+- [ ] Tags in correct order with blank line separator
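A commit message that satisfies the checklist above might look like this (the SHA, subject, and author are hypothetical):

```
net/ixgbe: fix Rx queue leak on probe failure

Release the Rx queue memory when device probe fails, otherwise
the allocation leaks on every probe retry.

Fixes: 0123456789ab ("net/ixgbe: add Rx queue setup")
Cc: stable@dpdk.org

Signed-off-by: Jane Doe <jane.doe@example.com>
```

Note the blank line separating the tag block from the body, the `Fixes:`/`Cc:` pair kept adjacent, and the blank line before `Signed-off-by:`.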
+
+### License
+
+- [ ] SPDX identifier on first line (or second for scripts)
+- [ ] Copyright line follows SPDX
+- [ ] Blank line after copyright before code
+- [ ] Appropriate license for file type
+
+### Code Style
+
+- [ ] Lines <=100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+- [ ] No usage of deprecated APIs, macros, or functions
+- [ ] Process-shared primitives in shared memory use `PTHREAD_PROCESS_SHARED`
+
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+- [ ] New public functions have `RTE_EXPORT_*` macro in `.c` file
+- [ ] Experimental functions use `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, version)`
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (<=3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+- No strict line length limit for meson files; lines under 100 characters are acceptable
+
+```meson
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list - continuation lines double-indented, trailing comma on last item
+headers = files(
+        'header1.h',
+        'header2.h',
+        'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must comply with formatting standards
+- Use **`black`** for code formatting validation
+- Line length acceptable up to 100 characters
+
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+
+*Correctness bugs (highest value findings):*
+- Use-after-free
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference on reachable code path
+- Buffer overflow or out-of-bounds access
+- Missing error check on a function that can fail, leading to undefined behavior
+- Race condition on shared mutable state without synchronization
+- Error path that skips necessary cleanup
+
+*Process and format errors:*
+- Missing or malformed SPDX license
+- Missing Signed-off-by
+- Subject line over 60 characters
+- Body lines over 75 characters
+- Wrong tag order or format
+- Missing required tags (Fixes, Coverity issue, Bugzilla ID)
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+- pthread mutex/cond/rwlock in shared memory without `PTHREAD_PROCESS_SHARED`
+
+**Warning** (should fix):
+- Subject line style issues (case, punctuation)
+- Wrong component prefix
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- New public function missing `RTE_EXPORT_*` macro
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+- Usage of deprecated APIs, macros, or functions in new code
+
+**Do NOT flag** (common false positives):
+- Missing `version.map` updates (maps are auto-generated from `RTE_EXPORT_*` macros)
+- Suggesting manual edits to any `version.map` file
+- Copyright years within valid range (2013 to current year)
+- Copyright held by individuals (never speculate about employers, company policies, or who "should" hold copyright)
+- SPDX/copyright format that matches the standard DPDK format (do not call it "different" or "unusual")
+- Meson file lines under 100 characters
+- Case-sensitive term violations in code comments (words-case.txt applies to commit messages only)
+- Comparisons using `== 0`, `!= 0`, `== NULL`, `!= NULL` as "implicit" (these ARE explicit)
+- Comparisons wrapped in `likely()` or `unlikely()` macros - these are still explicit if using == or !=
+- Anything you determine is correct (do not mention non-issues or say "No issue here")
+- `REGISTER_FAST_TEST` using `NOHUGE_OK`/`ASAN_OK` macros (this is the correct current format)
+- Tag ordering: Signed-off-by directly after commit body (when no Fixes/Cc tags) is CORRECT
+- Tag ordering: no blank line between Fixes/Cc, blank line before Signed-off-by (when Fixes/Cc present) is CORRECT
+- Missing release notes for test-only changes (unit tests do not require release notes)
+- Missing release notes for internal APIs or helper functions (only public APIs need release notes)
+- Subject lines that are within the 60 character limit (count first, do not guess)
+- Any item you later correct with "(Correction: ...)" or "actually acceptable" - just omit it
+- Vague concerns ("should be verified", "should be checked") - if you're not sure it's wrong, don't flag it
+- Items where you say "which is correct" or "this is correct" - if it's correct, don't mention it at all
+- Fixes tags with correct format (12-char SHA, quoted subject) - NEVER ask for verification that commit exists
+- Items where you conclude "no issue here" or "this is actually correct" - omit these entirely
+- Cross-patch compilation dependencies - you cannot determine patch ordering correctness from review
+- Claims that a symbol "was removed in patch N" causing issues in patch M - assume author ordered correctly
+- Any speculation about whether patches will compile when applied in sequence
+- Mutexes/locks in process-private memory (standard `malloc`, stack, static non-shared) - these don't need `PTHREAD_PROCESS_SHARED`
+- Use of `rte_spinlock_t` or `rte_rwlock_t` in shared memory (these work correctly without special init)
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+## Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example:
+The NULL check on `name` comes after the first dereference, so a NULL
+argument crashes. Suggested fix: move the check before the dereference,
+e.g. `if (name == NULL) return -EINVAL;`.
+
+---
+
+## FINAL CHECK BEFORE SUBMITTING REVIEW
+
+Before outputting your review, do two separate passes:
+
+### Pass 1: Verify correctness bugs are included
+
+Ask: "Did I trace every error path for resource leaks? Did I check
+for use-after-free? Did I verify error codes are propagated?"
+
+If you identified a potential correctness bug but talked yourself
+out of it, **add it back**. It is better to report a possible bug
+than to miss a real one.
+
+### Pass 2: Remove style/process false positives
+
+For EACH style/process item, ask: "Did I conclude this is actually
+fine/correct/acceptable/no issue?"
+
+If YES, DELETE THAT ITEM. It should not be in your output.
+
+An item that says "X is wrong... actually this is correct" is a
+FALSE POSITIVE and must be removed. This applies to style, format,
+and process items only.
+
+**If your Errors section would be empty after this check, that's
+fine -- it means the patches are good.**
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v8 2/6] devtools: add multi-provider AI patch review script
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
@ 2026-02-09 19:48     ` Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

This is an AI-generated script that reviews DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).
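
As an illustrative sketch (the wrapper below is not part of this series;
only the ANTHROPIC_API_KEY name and script path come from the text above),
a caller can fail fast when the key for the chosen provider is missing:

```shell
# Illustrative wrapper: verify the provider's API key is set before
# invoking the reviewer, so a missing key fails with a clear message
# instead of an API error mid-run.
check_key() {
    if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
        echo "error: ANTHROPIC_API_KEY is not set" >&2
        return 1
    fi
    return 0
}

if check_key && [ -x devtools/analyze-patch.py ]; then
    ./devtools/analyze-patch.py -v "$1"
fi
```

The same pattern applies to OPENAI_API_KEY, XAI_API_KEY, and
GOOGLE_API_KEY when another provider is selected.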

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 1334 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1334 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..c77908fb3c
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,1334 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from datetime import date
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Iterator
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Large file handling modes
+LARGE_FILE_MODES = ["error", "truncate", "chunk", "commits-only", "summary"]
+
+# Approximate characters per token (conservative estimate for code)
+CHARS_PER_TOKEN = 3.5
+
+# Default token limits by provider (leaving room for system prompt and response)
+PROVIDER_INPUT_LIMITS = {
+    "anthropic": 180000,  # 200K context, reserve for system/response
+    "openai": 115000,  # 128K context for gpt-4o
+    "xai": 115000,  # Assume similar to OpenAI
+    "google": 900000,  # Gemini has 1M context
+}
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4o",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-3",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-2.0-flash",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# LTS releases: any DPDK release with minor version .11
+# (e.g., 19.11, 20.11, 21.11, 22.11, 23.11, 24.11, 25.11, ...)
+
+SYSTEM_PROMPT_BASE = """\
+You are an expert DPDK code reviewer. Analyze patches for compliance with \
+DPDK coding standards and contribution guidelines. Provide clear, actionable \
+feedback organized by severity (Error, Warning, Info) as defined in the \
+guidelines."""
+
+LTS_RULES = """
+LTS (Long Term Stable) branch rules apply:
+- Only bug fixes allowed, no new features
+- No new APIs (experimental or stable)
+- ABI must remain unchanged
+- Backported fixes should reference the original commit with Fixes: tag
+- Copyright years should reflect when the code was originally written
+- Be conservative: reject changes that aren't clearly bug fixes"""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Check for:
+
+1. Commit message format (subject line, body, tags)
+2. License/copyright compliance
+3. C coding style issues
+4. API and documentation requirements
+5. Any other guideline violations
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg):
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key):
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def is_lts_release(release):
+    """Check if a release is an LTS release.
+
+    Per DPDK project guidelines, any release with minor version .11
+    is an LTS release (e.g., 19.11, 21.11, 23.11, 24.11, 25.11).
+    """
+    if not release:
+        return False
+    # Check for explicit -lts suffix
+    if "-lts" in release.lower():
+        return True
+    # Extract base version (e.g., "23.11" from "23.11.1" or "23.11-rc1")
+    version = release.split("-")[0]
+    parts = version.split(".")
+    if len(parts) >= 2:
+        try:
+            minor = int(parts[1])
+            return minor == 11
+        except ValueError:
+            pass
+    return False
+
+
+def estimate_tokens(text):
+    """Estimate token count from text length."""
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def split_mbox_patches(content):
+    """Split an mbox file into individual patches."""
+    patches = []
+    current_patch = []
+    in_patch = False
+
+    for line in content.split("\n"):
+        # Detect start of new message in mbox format
+        if line.startswith("From ") and (
+            " Mon " in line
+            or " Tue " in line
+            or " Wed " in line
+            or " Thu " in line
+            or " Fri " in line
+            or " Sat " in line
+            or " Sun " in line
+        ):
+            if current_patch:
+                patches.append("\n".join(current_patch))
+            current_patch = [line]
+            in_patch = True
+        elif in_patch:
+            current_patch.append(line)
+
+    # Don't forget the last patch
+    if current_patch:
+        patches.append("\n".join(current_patch))
+
+    return patches if patches else [content]
+
+
+def extract_commit_messages(content):
+    """Extract only commit messages from patch content."""
+    patches = split_mbox_patches(content)
+    messages = []
+
+    for patch in patches:
+        lines = patch.split("\n")
+        msg_lines = []
+        in_headers = True
+        in_body = False
+        found_subject = False
+
+        for line in lines:
+            # Collect headers we care about
+            if in_headers:
+                if line.startswith("Subject:"):
+                    msg_lines.append(line)
+                    found_subject = True
+                elif line.startswith(("From:", "Date:")):
+                    msg_lines.append(line)
+                elif line.startswith((" ", "\t")) and found_subject:
+                    # Subject continuation
+                    msg_lines.append(line)
+                elif line == "":
+                    if found_subject:
+                        in_headers = False
+                        in_body = True
+                        msg_lines.append("")
+            elif in_body:
+                # Stop at the diff
+                if line.startswith("---") and not line.startswith("----"):
+                    break
+                if line.startswith("diff --git"):
+                    break
+                msg_lines.append(line)
+
+        if msg_lines:
+            messages.append("\n".join(msg_lines))
+
+    return "\n\n---\n\n".join(messages)
+
+
+def truncate_content(content, max_tokens, provider):
+    """Truncate content to fit within token limit."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN)
+
+    if len(content) <= max_chars:
+        return content, False
+
+    # Try to truncate at a reasonable boundary
+    truncated = content[:max_chars]
+
+    # Find last complete diff hunk or patch boundary
+    last_diff = truncated.rfind("\ndiff --git")
+    last_patch = truncated.rfind("\nFrom ")
+
+    if last_diff > max_chars * 0.5:
+        truncated = truncated[:last_diff]
+    elif last_patch > max_chars * 0.5:
+        truncated = truncated[:last_patch]
+
+    truncated += "\n\n[... Content truncated due to size limits ...]\n"
+    return truncated, True
+
+
+def chunk_content(content, max_tokens, provider) -> Iterator[tuple[str, int, int]]:
+    """Split content into chunks that fit within token limit.
+
+    Yields tuples of (chunk_content, chunk_number, total_chunks).
+    """
+    patches = split_mbox_patches(content)
+
+    if len(patches) == 1:
+        # Single large patch - split by diff sections
+        yield from chunk_single_patch(content, max_tokens)
+        return
+
+    # Multiple patches - group them to fit within limits
+    chunks = []
+    current_chunk = []
+    current_size = 0
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)  # 90% to leave margin
+
+    for patch in patches:
+        patch_size = len(patch)
+        if current_size + patch_size > max_chars and current_chunk:
+            chunks.append("\n".join(current_chunk))
+            current_chunk = []
+            current_size = 0
+
+        if patch_size > max_chars:
+            # Single patch too large, truncate it
+            if current_chunk:
+                chunks.append("\n".join(current_chunk))
+                current_chunk = []
+                current_size = 0
+            truncated, _ = truncate_content(patch, max_tokens * 0.9, provider)
+            chunks.append(truncated)
+        else:
+            current_chunk.append(patch)
+            current_size += patch_size
+
+    if current_chunk:
+        chunks.append("\n".join(current_chunk))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def chunk_single_patch(content, max_tokens) -> Iterator[tuple[str, int, int]]:
+    """Split a single large patch by diff sections."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)
+
+    # Extract header (everything before first diff)
+    first_diff = content.find("\ndiff --git")
+    if first_diff == -1:
+        # No diff sections, just truncate
+        truncated, _ = truncate_content(content, max_tokens * 0.9, "anthropic")
+        yield truncated, 1, 1
+        return
+
+    header = content[: first_diff + 1]
+    diff_content = content[first_diff + 1 :]
+
+    # Split by diff sections
+    diffs = []
+    current_diff = []
+    for line in diff_content.split("\n"):
+        if line.startswith("diff --git") and current_diff:
+            diffs.append("\n".join(current_diff))
+            current_diff = []
+        current_diff.append(line)
+    if current_diff:
+        diffs.append("\n".join(current_diff))
+
+    # Group diffs into chunks
+    chunks = []
+    current_chunk_diffs = []
+    current_size = len(header)
+
+    for diff in diffs:
+        diff_size = len(diff)
+        if current_size + diff_size > max_chars and current_chunk_diffs:
+            chunks.append(header + "\n".join(current_chunk_diffs))
+            current_chunk_diffs = []
+            current_size = len(header)
+
+        if diff_size + len(header) > max_chars:
+            # Single diff too large
+            if current_chunk_diffs:
+                chunks.append(header + "\n".join(current_chunk_diffs))
+                current_chunk_diffs = []
+            truncated_diff = diff[: max_chars - len(header) - 100]
+            truncated_diff += "\n[... diff truncated ...]\n"
+            chunks.append(header + truncated_diff)
+            current_size = len(header)
+        else:
+            current_chunk_diffs.append(diff)
+            current_size += diff_size
+
+    if current_chunk_diffs:
+        chunks.append(header + "\n".join(current_chunk_diffs))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def get_summary_prompt():
+    """Get prompt modifications for summary mode."""
+    return """
+NOTE: This is a LARGE patch series. Provide a HIGH-LEVEL summary review only:
+- Focus on overall architecture and design concerns
+- Check commit message formatting across the series
+- Identify any obvious policy violations
+- Do NOT attempt detailed line-by-line code review
+- Summarize the scope and purpose of the changes
+"""
+
+
+def format_combined_reviews(reviews, output_format, patch_name):
+    """Combine multiple chunk/patch reviews into a single output."""
+    if output_format == "json":
+        combined = {
+            "patch_file": patch_name,
+            "sections": [
+                {"label": label, "review": review} for label, review in reviews
+            ],
+        }
+        return json.dumps(combined, indent=2)
+    elif output_format == "html":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"<h2>{label}</h2>\n{review}")
+        return "\n<hr>\n".join(sections)
+    elif output_format == "markdown":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"## {label}\n\n{review}")
+        return "\n\n---\n\n".join(sections)
+    else:  # text
+        sections = []
+        for label, review in reviews:
+            sections.append(f"=== {label} ===\n\n{review}")
+        return ("\n\n" + "=" * 60 + "\n\n").join(sections)
+
+
+def build_system_prompt(review_date, release):
+    """Build system prompt with date and release context."""
+    prompt = SYSTEM_PROMPT_BASE
+    prompt += f"\n\nCurrent date: {review_date}."
+
+    if release:
+        prompt += f"\nTarget DPDK release: {release}."
+        if is_lts_release(release):
+            prompt += LTS_RULES
+        else:
+            prompt += "\nThis is a main branch or standard release."
+            prompt += "\nNew features and experimental APIs are allowed."
+
+    return prompt
+
+
+def build_anthropic_request(
+    model,
+    max_tokens,
+    system_prompt,
+    agents_content,
+    patch_content,
+    patch_name,
+    output_format="text",
+):
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": system_prompt},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model,
+    max_tokens,
+    system_prompt,
+    agents_content,
+    patch_content,
+    patch_name,
+    output_format="text",
+):
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": system_prompt},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens,
+    system_prompt,
+    agents_content,
+    patch_content,
+    patch_name,
+    output_format="text",
+):
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": system_prompt}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider,
+    api_key,
+    model,
+    max_tokens,
+    system_prompt,
+    agents_content,
+    patch_content,
+    patch_name,
+    output_format="text",
+    verbose=False,
+):
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req, timeout=300) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: {usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: {usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def get_last_message_id(patch_content):
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-I[Dd]:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize: remove < > and add them back
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content):
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][8:].strip()
+            i += 1
+            # Handle continuation lines
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs, cc_addrs, from_addr, subject, in_reply_to, body, dry_run=False
+):
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try sendmail
+        try:
+            subprocess.run(
+                ["sendmail", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via sendmail", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        # Try msmtp
+        try:
+            subprocess.run(
+                ["msmtp", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via msmtp", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers():
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s -r 24.11 patch.patch           # Review for specific release
+    %(prog)s -r 24.11-lts patch.patch       # Review for LTS branch
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+
+Large File Handling:
+    %(prog)s --split-patches series.mbox    # Review each patch separately
+    %(prog)s --split-patches --patch-range 1-5 series.mbox  # Review patches 1-5
+    %(prog)s --large-file=truncate patch.mbox   # Truncate to fit limit
+    %(prog)s --large-file=commits-only series.mbox  # Review commit messages only
+    %(prog)s --large-file=summary series.mbox   # High-level summary only
+    %(prog)s --large-file=chunk series.mbox     # Split and review in chunks
+
+Large File Modes:
+    error       - Fail with error (default)
+    truncate    - Truncate content to fit token limit
+    chunk       - Split into chunks and review each
+    commits-only - Extract and review only commit messages
+    summary     - Request high-level summary review
+
+LTS Releases:
+    Use -r/--release with LTS version (e.g., 24.11-lts, 23.11) to enable
+    stricter review rules: bug fixes only, no new features or APIs.
+    Any DPDK release with minor version .11 is an LTS release.
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Date and release options
+    parser.add_argument(
+        "-D",
+        "--date",
+        metavar="YYYY-MM-DD",
+        help="Review date context (default: today)",
+    )
+    parser.add_argument(
+        "-r",
+        "--release",
+        metavar="VERSION",
+        help="Target DPDK release (e.g., 24.11, 23.11-lts)",
+    )
+
+    # Large file handling options
+    large_group = parser.add_argument_group("Large File Handling")
+    large_group.add_argument(
+        "--large-file",
+        choices=LARGE_FILE_MODES,
+        default="error",
+        metavar="MODE",
+        help="How to handle large files: error (default), truncate, "
+        "chunk, commits-only, summary",
+    )
+    large_group.add_argument(
+        "--max-tokens",
+        type=int,
+        metavar="N",
+        help="Max input tokens (default: provider-specific)",
+    )
+    large_group.add_argument(
+        "--split-patches",
+        action="store_true",
+        help="Split mbox into individual patches and review each separately",
+    )
+    large_group.add_argument(
+        "--patch-range",
+        metavar="N-M",
+        help="Review only patches N through M (1-indexed, use with --split-patches)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine review date
+    review_date = args.date or date.today().isoformat()
+
+    # Build system prompt with date and release context
+    system_prompt = build_system_prompt(review_date, args.release)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    # Determine max tokens for this provider
+    max_input_tokens = args.max_tokens or PROVIDER_INPUT_LIMITS.get(
+        args.provider, 100000
+    )
+
+    # Estimate token count
+    estimated_tokens = estimate_tokens(patch_content + agents_content)
+
+    # Parse patch range if specified
+    patch_start, patch_end = None, None
+    if args.patch_range:
+        try:
+            if "-" in args.patch_range:
+                start, end = args.patch_range.split("-", 1)
+                patch_start = int(start)
+                patch_end = int(end)
+            else:
+                patch_start = patch_end = int(args.patch_range)
+        except ValueError:
+            error(f"Invalid --patch-range format: {args.patch_range}")
+
+    # Handle --split-patches mode
+    if args.split_patches:
+        patches = split_mbox_patches(patch_content)
+        total_patches = len(patches)
+
+        if total_patches == 1:
+            print(
+                "Note: Only 1 patch found in mbox, --split-patches has no effect",
+                file=sys.stderr,
+            )
+        else:
+            print(
+                f"Found {total_patches} patches in mbox",
+                file=sys.stderr,
+            )
+
+            # Apply patch range filter
+            if patch_start is not None:
+                if patch_start < 1 or patch_start > total_patches:
+                    error(
+                        f"Patch range start {patch_start} out of range (1-{total_patches})"
+                    )
+                if patch_end < patch_start or patch_end > total_patches:
+                    error(
+                        f"Patch range end {patch_end} out of range ({patch_start}-{total_patches})"
+                    )
+                patches = patches[patch_start - 1 : patch_end]
+                print(
+                    f"Reviewing patches {patch_start}-{patch_end} ({len(patches)} patches)",
+                    file=sys.stderr,
+                )
+
+            # Review each patch separately
+            all_reviews = []
+            for i, patch in enumerate(patches, patch_start or 1):
+                patch_label = f"Patch {i}/{total_patches}"
+                print(f"\nReviewing {patch_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    patch,
+                    f"{patch_name} ({patch_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((patch_label, review_text))
+
+            # Combine reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal API call
+            estimated_tokens = 0  # Bypass size check since we've already processed
+
+    # Check if content is too large
+    is_large = estimated_tokens > max_input_tokens
+
+    if is_large:
+        print(
+            f"Warning: Estimated {estimated_tokens:,} tokens exceeds limit of "
+            f"{max_input_tokens:,}",
+            file=sys.stderr,
+        )
+
+        if args.large_file == "error":
+            error(
+                f"Patch file too large ({estimated_tokens:,} tokens). "
+                f"Use --large-file=truncate|chunk|commits-only|summary to handle, "
+                f"or --split-patches to review patches individually."
+            )
+        elif args.large_file == "truncate":
+            print("Truncating content to fit token limit...", file=sys.stderr)
+            patch_content, was_truncated = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+            if was_truncated:
+                print("Content was truncated.", file=sys.stderr)
+        elif args.large_file == "commits-only":
+            print("Extracting commit messages only...", file=sys.stderr)
+            patch_content = extract_commit_messages(patch_content)
+            new_estimate = estimate_tokens(patch_content + agents_content)
+            print(
+                f"Reduced to ~{new_estimate:,} tokens (commit messages only)",
+                file=sys.stderr,
+            )
+            if new_estimate > max_input_tokens:
+                patch_content, _ = truncate_content(
+                    patch_content, max_input_tokens, args.provider
+                )
+        elif args.large_file == "summary":
+            print("Using summary mode for large patch...", file=sys.stderr)
+            system_prompt += get_summary_prompt()
+            patch_content, _ = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+        elif args.large_file == "chunk":
+            print("Processing in chunks...", file=sys.stderr)
+            all_reviews = []
+            for chunk, chunk_num, total_chunks in chunk_content(
+                patch_content, max_input_tokens, args.provider
+            ):
+                chunk_label = f"Chunk {chunk_num}/{total_chunks}"
+                print(f"Reviewing {chunk_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    chunk,
+                    f"{patch_name} ({chunk_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((chunk_label, review_text))
+
+            # Combine chunk reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal single API call below
+            estimated_tokens = 0
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Review date: {review_date}", file=sys.stderr)
+        if args.release:
+            lts_status = " (LTS)" if is_lts_release(args.release) else ""
+            print(f"Target release: {args.release}{lts_status}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        print(f"Estimated tokens: {estimated_tokens:,}", file=sys.stderr)
+        print(f"Max input tokens: {max_input_tokens:,}", file=sys.stderr)
+        if args.large_file != "error":
+            print(f"Large file mode: {args.large_file}", file=sys.stderr)
+        if args.split_patches:
+            print("Split patches: yes", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API (unless already processed via chunks/split)
+    if estimated_tokens > 0:  # Not already processed
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            args.output_format,
+            args.verbose,
+        )
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "review_date": review_date,
+                "target_release": args.release,
+                "is_lts": is_lts_release(args.release) if args.release else False,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"<br>Target release: {args.release}{lts_badge}"
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) on {review_date} -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model}) on {review_date}{release_info}</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"\n*Target release: {args.release}{lts_badge}*\n"
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model}) on {review_date}*
+{release_info}
+{review_text}
+"""
+    else:  # text
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n"
+        output_text += f"Review date: {review_date}\n"
+        output_text += release_info
+        output_text += "\n" + review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model}) on {review_date}
+{release_info}
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v8 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-02-09 19:48     ` Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.
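
The auto-detection described above can be sketched as follows (a
simplified illustration: only two providers are shown, and the
function name is made up for this example):

```shell
# Simplified sketch of key-based provider detection: a provider is
# listed only when its API-key environment variable is non-empty.
detect_providers() {
    available=""
    [ -n "${ANTHROPIC_API_KEY:-}" ] && available="${available}anthropic,"
    [ -n "${OPENAI_API_KEY:-}" ] && available="${available}openai,"
    echo "${available%,}"   # drop the trailing comma
}
```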

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.txt when using the
output directory option.
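
As a sketch of that naming scheme (the patch name and provider list
below are placeholders):

```shell
# Illustrate the <patch>-<provider>.<ext> naming described above.
patch_stem="my-patch"
ext="txt"
for provider in anthropic openai; do
    printf 'reviews/%s-%s.%s\n' "$patch_stem" "$provider" "$ext"
done
```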

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e -o pipefail
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: \
+ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v8 4/6] devtools: add multi-provider AI documentation review script
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (2 preceding siblings ...)
  2026-02-09 19:48     ` [PATCH v8 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-02-09 19:48     ` Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models. Supports batch processing of multiple files.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

Output formats (-f/--format):
  - text: plain text with extractable diff/msg markers (default)
  - markdown: formatted review document
  - html: complete HTML document with styling
  - json: structured data with metadata

For each input file, the script produces:
  - <basename>.{txt,md,html,json}: review in selected format
  - <basename>.diff: unified diff (text/json, or with -d flag)
  - <basename>.msg: commit message (text/json, or with -d flag)

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide).
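
An abridged sketch of that path-to-prefix mapping (the real
COMMIT_PREFIX_MAP in review-doc.py carries many more entries, and the
fallback shown here is assumed):

```shell
# First matching path prefix wins, most specific entries first.
commit_prefix() {
    case "$1" in
        doc/guides/prog_guide/*) echo "doc/guides/prog_guide:" ;;
        doc/guides/nics/*)       echo "doc/guides/nics:" ;;
        doc/guides/*)            echo "doc:" ;;
        doc/*)                   echo "doc:" ;;
        *)                       echo "doc:" ;;  # fallback is assumed
    esac
}
```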

Features:
  - Multiple file processing with glob support
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output-dir option
  - Output format selection via -f/--format option
  - Force diff/msg generation via -d/--diff option
  - Quiet mode (-q) suppresses stdout output
  - Verbose mode (-v) shows token usage and API details
  - Email integration using git sendemail configuration
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py doc/guides/nics/*.rst
  ./devtools/review-doc.py -f html -d -o /tmp doc/guides/nics/*.rst
  ./devtools/review-doc.py --send-email --to dev@dpdk.org file.rst

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).
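
For example, before running with the default provider (the value
shown is a placeholder, not a real key):

```shell
# Export the key for the chosen provider; replace the placeholder.
export ANTHROPIC_API_KEY="sk-placeholder"
```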

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 1098 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1098 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..1366aa0f85
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,1098 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Accepts multiple documentation files and generates output for each.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+from email.message import EmailMessage
+from pathlib import Path
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Map output format to file extension
+FORMAT_EXTENSIONS = {
+    "text": ".txt",
+    "markdown": ".md",
+    "html": ".html",
+    "json": ".json",
+}
+
+# Additional markers for extracting diff/msg (used with --diff flag)
+DIFF_MARKERS_INSTRUCTION = """
+
+ADDITIONALLY, at the end of your response, include these exact markers for automated extraction:
+---COMMIT_MESSAGE_START---
+(same commit message as above)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(same unified diff as above)
+---UNIFIED_DIFF_END---
+"""
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4o",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-3",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-2.0-flash",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds via git commit --signoff)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg):
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key):
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config():
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    if config["port"]:
+        config["port"] = int(config["port"])
+
+    return config
+
+
+def get_commit_prefix(filepath):
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_anthropic_request(
+    model,
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+    include_diff_markers=False,
+):
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model,
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+    include_diff_markers=False,
+):
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+    include_diff_markers=False,
+):
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider,
+    api_key,
+    model,
+    max_tokens,
+    agents_content,
+    doc_content,
+    doc_file,
+    commit_prefix,
+    output_format="text",
+    include_diff_markers=False,
+    verbose=False,
+):
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                "Cache creation: " f"{usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                "Completion tokens: " f"{usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def parse_review_text(review_text):
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def strip_diff_markers(text):
+    """Remove the diff/msg extraction markers from text."""
+    # Remove commit message markers and content
+    text = re.sub(
+        r"\n*---COMMIT_MESSAGE_START---\s*\n.*?\n---COMMIT_MESSAGE_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    # Remove unified diff markers and content
+    text = re.sub(
+        r"\n*---UNIFIED_DIFF_START---\s*\n.*?\n---UNIFIED_DIFF_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    return text.strip()
+
+
+def send_email(
+    to_addrs,
+    cc_addrs,
+    from_addr,
+    subject,
+    in_reply_to,
+    body,
+    dry_run=False,
+    verbose=False,
+):
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+        return True
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers():
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers. "
+        "Accepts multiple files and generates output for each.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s doc/guides/nics/*.rst              # Review all NIC docs
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst doc/guides/nics/i40e.rst
+    %(prog)s -f html -d -o /tmp/reviews doc/guides/nics/*.rst  # HTML + diff files
+    %(prog)s -f json -o /tmp doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+Output files (in output-dir):
+    <basename>.txt|.md|.html|.json  Review in selected format
+    <basename>.diff                  Unified diff (text/json, or with --diff)
+    <basename>.msg                   Commit message (text/json, or with --diff)
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+        """,
+    )
+
+    parser.add_argument(
+        "doc_files",
+        nargs="+",
+        metavar="doc_file",
+        help="Documentation file(s) to review",
+    )
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for all output files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress review output to stdout (only write files)",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-d",
+        "--diff",
+        action="store_true",
+        help="Always produce .diff and .msg files (automatic for text/json)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    # Validate all doc files exist before processing
+    doc_paths = []
+    for doc_file in args.doc_files:
+        doc_path = Path(doc_file)
+        if not doc_path.exists():
+            error(f"Documentation file not found: {doc_file}")
+        doc_paths.append((doc_file, doc_path))
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read AGENTS.md once
+    agents_content = agents_path.read_text()
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    provider_name = config["name"]
+
+    # Process each file
+    num_files = len(doc_paths)
+    for file_idx, (doc_file, doc_path) in enumerate(doc_paths, 1):
+        if num_files > 1:
+            print(
+                f"\n{'=' * 60}",
+                file=sys.stderr,
+            )
+            print(
+                f"Processing file {file_idx}/{num_files}: {doc_file}",
+                file=sys.stderr,
+            )
+            print(
+                f"{'=' * 60}",
+                file=sys.stderr,
+            )
+
+        # Determine output filenames
+        doc_basename = doc_path.stem
+        diff_file = output_dir / f"{doc_basename}.diff"
+        msg_file = output_dir / f"{doc_basename}.msg"
+
+        # Get commit prefix
+        commit_prefix = get_commit_prefix(doc_file)
+
+        # Read doc content
+        doc_content = doc_path.read_text()
+
+        if args.verbose:
+            print("=== Request ===", file=sys.stderr)
+            print(f"Provider: {args.provider}", file=sys.stderr)
+            print(f"Model: {model}", file=sys.stderr)
+            print(f"Output format: {args.output_format}", file=sys.stderr)
+            print(f"AGENTS file: {args.agents}", file=sys.stderr)
+            print(f"Doc file: {doc_file}", file=sys.stderr)
+            print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+            print(f"Output dir: {args.output_dir}", file=sys.stderr)
+            if args.send_email:
+                print("Send email: yes", file=sys.stderr)
+                print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+                if args.cc_addrs:
+                    print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+                print(f"From: {from_addr}", file=sys.stderr)
+            print("===============", file=sys.stderr)
+
+        # Call API
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            args.output_format,
+            args.diff,
+            args.verbose,
+        )
+
+        if not review_text:
+            print(
+                f"Warning: No response received for {doc_file}",
+                file=sys.stderr,
+            )
+            continue
+
+        # Determine review output file
+        format_ext = FORMAT_EXTENSIONS[args.output_format]
+        review_file = output_dir / f"{doc_basename}{format_ext}"
+
+        # Determine if we should write diff/msg files
+        write_diff_msg = args.diff or args.output_format in ("text", "json")
+
+        # Extract commit message and diff first (before stripping markers)
+        commit_msg, diff = "", ""
+        if write_diff_msg:
+            if args.output_format == "json":
+                # Will extract from JSON below
+                pass
+            else:
+                # Parse from text format markers
+                commit_msg, diff = parse_review_text(review_text)
+
+        # For non-text formats with --diff, strip the markers from display output
+        display_text = review_text
+        if args.diff and args.output_format in ("markdown", "html"):
+            display_text = strip_diff_markers(review_text)
+
+        # Build formatted output text
+        if args.output_format == "text":
+            output_text = review_text
+        elif args.output_format == "json":
+            # Try to parse JSON response
+            try:
+                review_data = json.loads(review_text)
+            except json.JSONDecodeError:
+                print("Warning: Response is not valid JSON", file=sys.stderr)
+                review_data = {"raw_response": review_text}
+
+            # Extract diff/msg from JSON if present
+            if write_diff_msg:
+                if isinstance(review_data, dict) and "raw_response" not in review_data:
+                    commit_msg = review_data.get("commit_message", "")
+                    diff = review_data.get("diff", "")
+
+            # Add metadata
+            output_data = {
+                "metadata": {
+                    "doc_file": doc_file,
+                    "provider": args.provider,
+                    "provider_name": provider_name,
+                    "model": model,
+                    "commit_prefix": commit_prefix,
+                },
+                "review": review_data,
+            }
+            output_text = json.dumps(output_data, indent=2)
+        elif args.output_format == "markdown":
+            output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{display_text}
+"""
+        elif args.output_format == "html":
+            output_text = f"""<!DOCTYPE html>
+<html>
+<head>
+<meta charset="utf-8">
+<title>Review: {doc_path.name}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; max-width: 900px; margin: 2em auto; padding: 0 1em; }}
+h1 {{ color: #333; }}
+.review-meta {{ color: #666; font-style: italic; }}
+pre {{ background: #f5f5f5; padding: 1em; overflow-x: auto; }}
+</style>
+</head>
+<body>
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+<div class="review-content">
+{display_text}
+</div>
+</body>
+</html>
+"""
+
+        # Write formatted review to file
+        review_file.write_text(output_text)
+        print(f"Review written to: {review_file}", file=sys.stderr)
+
+        # Write diff/msg files
+        if write_diff_msg:
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+            else:
+                msg_file.write_text("# No commit message generated\n")
+                print("Warning: Could not extract commit message", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+            else:
+                diff_file.write_text("# No changes suggested\n")
+                print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print to stdout unless quiet (or multiple files without verbose)
+        show_stdout = not args.quiet and (num_files == 1 or args.verbose)
+        if show_stdout:
+            print(
+                f"\n=== Documentation Review: {doc_path.name} "
+                f"(via {provider_name}) ==="
+            )
+            print(output_text)
+
+            # Print usage instructions for text format
+            if args.output_format == "text":
+                print("\n=== Output Files ===")
+                print(f"Commit message: {msg_file}")
+                print(f"Diff file:      {diff_file}")
+                print("\nTo apply changes:")
+                print(f"  git apply {diff_file}")
+                print(f"  git commit -sF {msg_file}")
+
+        # Send email if requested
+        if args.send_email:
+            if args.output_format != "text":
+                print(
+                    "Note: Email will be sent as plain text regardless of "
+                    f"--format={args.output_format}",
+                    file=sys.stderr,
+                )
+
+            review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+            # Build email body
+            email_body = f"""AI-generated documentation review of {doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+            if args.verbose:
+                print("", file=sys.stderr)
+                print("=== Email Details ===", file=sys.stderr)
+                print(f"Subject: {review_subject}", file=sys.stderr)
+                print("=====================", file=sys.stderr)
+
+            send_email(
+                args.to_addrs,
+                args.cc_addrs,
+                from_addr,
+                review_subject,
+                None,
+                email_body,
+                args.dry_run,
+                args.verbose,
+            )
+
+            if not args.dry_run:
+                print("", file=sys.stderr)
+                print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+    # Print summary for multiple files
+    if num_files > 1:
+        print(f"\n{'=' * 60}", file=sys.stderr)
+        print(f"Processed {num_files} files", file=sys.stderr)
+        print(f"Output directory: {output_dir}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0



* [PATCH v8 5/6] doc: add AI-assisted patch review to contributing guide
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (3 preceding siblings ...)
  2026-02-09 19:48     ` [PATCH v8 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-02-09 19:48     ` Stephen Hemminger
  2026-02-09 19:48     ` [PATCH v8 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a new section to the contributing guide describing the
analyze-patch.py script, which uses AI providers to review patches
against DPDK coding standards before submission to the mailing list.

The new section covers basic usage, provider selection, patch series
handling, LTS release review, and output format options. A note
clarifies that AI review supplements but does not replace human
review.

Also add a reference to the script in the new driver guide's
test tools checklist.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/contributing/new_driver.rst |  2 +
 doc/guides/contributing/patches.rst    | 56 ++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/doc/guides/contributing/new_driver.rst b/doc/guides/contributing/new_driver.rst
index 555e875329..6c0d356cfd 100644
--- a/doc/guides/contributing/new_driver.rst
+++ b/doc/guides/contributing/new_driver.rst
@@ -210,3 +210,5 @@ Be sure to run the following test tools per patch in a patch series:
 * `check-doc-vs-code.sh`
 * `check-spdx-tag.sh`
 * Build documentation and validate how output looks
+* Optionally run ``analyze-patch.py`` for AI-assisted review
+  (see :ref:`ai_assisted_review` in the Contributing Guide)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 5f554d47e6..74fc714d16 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -503,6 +503,62 @@ Additionally, when contributing to the DTS tool, patches should also be checked
 the ``dts-check-format.sh`` script in the ``devtools`` directory of the DPDK repo.
 To run the script, extra :ref:`Python dependencies <dts_deps>` are needed.
 
+
+.. _ai_assisted_review:
+
+AI-Assisted Patch Review
+------------------------
+
+Contributors may optionally use the ``devtools/analyze-patch.py`` script
+to get an AI-assisted review of patches before submitting them to the mailing list.
+The script checks patches against the DPDK coding standards and contribution
+guidelines documented in ``AGENTS.md``.
+
+The script supports multiple AI providers (Anthropic Claude, OpenAI ChatGPT,
+xAI Grok, Google Gemini).  An API key for the chosen provider must be set
+in the corresponding environment variable (see ``--list-providers``).
+
+Basic usage::
+
+   # Review a single patch (default provider: Anthropic Claude)
+   devtools/analyze-patch.py my-patch.patch
+
+   # Use a different provider
+   devtools/analyze-patch.py -p openai my-patch.patch
+
+   # Review for an LTS branch (enables stricter rules)
+   devtools/analyze-patch.py -r 24.11 my-patch.patch
+
+   # List available providers and their API key variables
+   devtools/analyze-patch.py --list-providers
+
+For a patch series in an mbox file, the ``--split-patches`` option reviews
+each patch individually::
+
+   devtools/analyze-patch.py --split-patches series.mbox
+
+   # Review only a range of patches
+   devtools/analyze-patch.py --split-patches --patch-range 1-5 series.mbox
+
+When reviewing for a Long Term Stable (LTS) release, use the ``-r`` option
+with the target version.  Any DPDK release with minor version ``.11``
+(e.g., 23.11, 24.11) is automatically recognized as LTS,
+and the script will enforce stricter rules: bug fixes only, no new features or APIs.
+
+Output can be formatted as plain text (default), Markdown, HTML, or JSON::
+
+   devtools/analyze-patch.py -f markdown -o review.md my-patch.patch
+
+The review guidelines in ``AGENTS.md`` cover commit message formatting,
+SPDX licensing, C coding style, forbidden API usage, symbol export rules,
+and other DPDK-specific requirements.
+
+.. note::
+
+   AI-assisted review is a supplement to, not a replacement for,
+   human review on the mailing list.
+   Always verify AI suggestions before acting on them.
+
 .. _contrib_check_compilation:
 
 Checking Compilation
-- 
2.51.0



* [PATCH v8 6/6] MAINTAINERS: add section for AI review tools
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (4 preceding siblings ...)
  2026-02-09 19:48     ` [PATCH v8 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
@ 2026-02-09 19:48     ` Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-09 19:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Thomas Monjalon

Add maintainer entries for the AI-assisted code review tooling:
AGENTS.md, analyze-patch.py, compare-reviews.sh, and
review-doc.py.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5683b87e4a..a30f08974e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -109,6 +109,14 @@ F: license/
 F: .editorconfig
 F: .mailmap
 
+AI review tools
+M: Stephen Hemminger <stephen@networkplumber.org>
+M: Aaron Conole <aconole@redhat.com>
+F: AGENTS.md
+F: devtools/analyze-patch.py
+F: devtools/compare-reviews.sh
+F: devtools/review-doc.py
+
 Linux kernel uAPI headers
 M: Maxime Coquelin <maxime.coquelin@redhat.com>
 F: devtools/linux-uapi.sh
-- 
2.51.0



* [PATCH v10 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-02-19 17:48 ` [PATCH v10 0/6] " Stephen Hemminger
@ 2026-02-19 17:48   ` Stephen Hemminger
  0 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-02-19 17:48 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.<ext> when using the
output directory option, where <ext> matches the selected format
(txt, md, html, or json).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e -o pipefail
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: "\
+"ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.51.0



* [PATCH v9 0/6] add AGENTS.md and scripts for AI code review
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (4 preceding siblings ...)
  2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-03-04 17:59   ` Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
                       ` (5 more replies)
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
                     ` (3 subsequent siblings)
  9 siblings, 6 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add guidelines and tooling for AI-assisted code review of DPDK
patches.

AGENTS.md provides a two-tier review framework: correctness bugs
(resource leaks, use-after-free, race conditions) are reported at
>=50% confidence; style issues require >80% with false positive
suppression. Mechanical checks handled by checkpatches.sh are
excluded to avoid redundant findings.

The analyze-patch.py script supports multiple AI providers
(Anthropic, OpenAI, xAI, Google) with mbox splitting, prompt
caching, and direct SMTP sending.

v9 - update AGENTS to reduce false positives
   - remove commit message/SPDX items from prompt (checkpatch's job).
   - update contributing guide text to match actual AGENTS.md coverage.

Stephen Hemminger (6):
  doc: add AGENTS.md for AI code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script
  doc: add AI-assisted patch review to contributing guide
  MAINTAINERS: add section for AI review tools

 AGENTS.md                              | 1917 ++++++++++++++++++++++++
 MAINTAINERS                            |    8 +
 devtools/analyze-patch.py              | 1348 +++++++++++++++++
 devtools/compare-reviews.sh            |  192 +++
 devtools/review-doc.py                 | 1099 ++++++++++++++
 doc/guides/contributing/new_driver.rst |    2 +
 doc/guides/contributing/patches.rst    |   59 +
 7 files changed, 4625 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100755 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.51.0



* [PATCH v9 1/6] doc: add AGENTS.md for AI code review tools
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-03-04 17:59     ` Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Provide structured guidelines for AI tools reviewing DPDK
patches. Focuses on correctness bug detection (resource leaks,
use-after-free, race conditions), C coding style, forbidden
tokens, API conventions, and severity classifications.

Mechanical checks already handled by checkpatches.sh (SPDX
format, commit message formatting, tag ordering) are excluded
to avoid redundant and potentially contradictory findings.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 1917 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1917 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..6acd4e2f5d
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,1917 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+## CRITICAL INSTRUCTION - READ FIRST
+
+This document has two categories of review rules with different
+confidence thresholds:
+
+### 1. Correctness Bugs -- HIGHEST PRIORITY (report at >=50% confidence)
+
+**Always report potential correctness bugs.** These are the most
+valuable findings. When in doubt, report them with a note about
+your confidence level. A possible use-after-free or resource leak
+is worth mentioning even if you are not certain.
+
+Correctness bugs include:
+- Use-after-free (accessing memory after `free`/`rte_free`)
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference
+- Buffer overflows or out-of-bounds access
+- Uninitialized variable use in a reachable code path
+- Race conditions (unsynchronized shared state)
+- `volatile` used instead of atomic operations for inter-thread shared variables
+- `__atomic_load_n()`/`__atomic_store_n()`/`__atomic_*()` GCC built-ins instead of `rte_atomic_*_explicit()`
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` legacy barriers instead of `rte_atomic_thread_fence()`
+- Missing error checks on functions that can fail
+- Error paths that skip cleanup (goto labels, missing free/close)
+- Incorrect error propagation (wrong return value, lost errno)
+- Logic errors in conditionals (wrong operator, inverted test)
+- Integer overflow/truncation in size calculations
+- Missing bounds checks on user-supplied sizes or indices
+- `mmap()` return checked against `NULL` instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=`
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied data
+- `1 << n` on 64-bit bitmask (must use `1ULL << n` or `RTE_BIT64()`)
+- Variable assigned then overwritten before being read (dead store)
+- Same variable used as loop counter in nested loops
+- `memcpy`/`memcmp`/`memset` with same pointer for source and destination (no-op or undefined)
+- `rte_pktmbuf_free_bulk()` called on mbufs that may originate from different mempools (Tx burst, ring dequeue)
+
+**Do NOT self-censor correctness bugs.** If you identify a code
+path where a resource could leak or memory could be used after
+free, report it. Do not talk yourself out of it.
+
+### 2. Style, Process, and Formatting -- suppress false positives
+
+**NEVER list a style/process item under "Errors" or "Warnings" if
+you conclude it is correct.**
+
+Before outputting any style, formatting, or process error/warning,
+verify it is actually wrong. If your analysis concludes with
+phrases like "there's no issue here", "which is fine", "appears
+correct", "is acceptable", or "this is actually correct" -- then
+DO NOT INCLUDE IT IN YOUR OUTPUT AT ALL. Delete it. Omit it
+entirely.
+
+This suppression rule applies to: naming conventions,
+code style, and process compliance. It does NOT apply to
+correctness bugs listed above. (SPDX/copyright format and
+commit message formatting are handled by checkpatch and are
+excluded from AI review entirely.)
+
+---
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux Kernel. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+
+**Correctness bugs are the primary goal of AI review.** Style and
+formatting checks are secondary. A review that catches a
+use-after-free but misses a style nit is far more valuable than
+one that catches every style issue but misses the bug.
+
+**BEFORE OUTPUTTING YOUR REVIEW**: Re-read each item.
+- For correctness bugs: keep them. If you have reasonable doubt
+  that a code path is safe, report it.
+- For style/process items: if ANY item contains phrases like "is
+  fine", "no issue", "appears correct", "is acceptable",
+  "actually correct" -- DELETE THAT ITEM. Do not include it.
+
+### Correctness review guidelines
+- Trace error paths: for every function that allocates a resource
+  or acquires a lock, verify that ALL error paths after that point
+  release it
+- Check every `goto error` and early `return`: does it clean up
+  everything allocated so far?
+- Look for use-after-free: after `free(p)`, is `p` accessed again?
+- Check that error codes are propagated, not silently dropped
+- Report at >=50% confidence; note uncertainty if appropriate
+- It is better to report a potential bug that turns out to be safe
+  than to miss a real bug
+
+### Style and process review guidelines
+- Only comment on style/process issues when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+- Do NOT comment on copyright years, SPDX format, or copyright holders - not subject to AI review
+- Do NOT report an issue then contradict yourself - if something is acceptable, do not mention it at all
+- Do NOT include items in Errors/Warnings that you then say are "acceptable" or "correct"
+- Do NOT mention things that are correct or "not an issue" - only report actual problems
+- Do NOT speculate about contributor circumstances (employment, company policies, etc.)
+- Before adding any style item to your review, ask: "Is this actually wrong?" If no, omit it entirely.
+- NEVER write "(Correction: ...)" - if you need to correct yourself, simply omit the item entirely
+- Do NOT add vague suggestions like "should be verified" or "should be checked" - either it's wrong or don't mention it
+- Do NOT flag something as an Error then say "which is correct" in the same item
+- Do NOT say "no issue here" or "this is actually correct" - if there's no issue, do not include it in your review
+- Do NOT analyze cross-patch dependencies or compilation order - you cannot reliably determine this from patch review
+- Do NOT claim a patch "would cause compilation failure" based on symbols used in other patches in the series
+- Review each patch individually for its own correctness; assume the patch author ordered them correctly
+- When reviewing a patch series, OMIT patches that have no issues.
+  Do not include a patch in your output just to say "no issues found"
+  or to summarize what the patch does. Only include patches where you
+  have actual findings to report.
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard-coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
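
As a sketch of the missing-input-validation item above, a minimal
check might look like the following (hypothetical `demo_name_is_safe`,
not a DPDK API; real code should canonicalize with `realpath()` and
compare against the base directory):

```c
#include <string.h>

/* Illustrative sketch only: reject user-supplied file names that
 * could escape the intended directory. */
static int
demo_name_is_safe(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return 0;
    if (name[0] == '/')              /* absolute path escapes base dir */
        return 0;
    if (strstr(name, "..") != NULL)  /* ".." components may climb out */
        return 0;
    return 1;
}
```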
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- **`volatile` for inter-thread synchronization**: `volatile` does not
+  provide atomicity or memory ordering between threads. Use
+  `rte_atomic_load_explicit()`/`rte_atomic_store_explicit()` with
+  appropriate `rte_memory_order_*` instead. See the Shared Variable
+  Access section under Forbidden Tokens for details.
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- **Use-after-free** (any access to memory after it has been freed)
+- **Error path resource leaks**: For every allocation or fd open,
+  trace each error path (`goto`, early `return`, conditional) to
+  verify the resource is released. Common patterns to check:
+  - `malloc`/`rte_malloc` followed by a failure that does `return -1`
+    instead of `goto cleanup`
+  - `open()`/`socket()` fd not closed on a later error
+  - Lock acquired but not released on an error branch
+  - Partially initialized structure where early fields are allocated
+    but later allocation fails without freeing the early ones
+- **Double-free / double-close**: resource freed in both a normal
+  path and an error path, or fd closed but not set to -1 allowing
+  a second close
+- **Missing error checks**: functions that can fail (malloc, open,
+  ioctl, etc.) whose return value is not checked
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Usage of deprecated APIs when replacements exist
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already shows (remove them)
+- **Process-shared synchronization errors** (pthread mutexes in shared memory without `PTHREAD_PROCESS_SHARED`)
+- **`mmap()` checked against NULL instead of `MAP_FAILED`**: `mmap()` returns
+  `MAP_FAILED` (i.e., `(void *)-1`) on failure, NOT `NULL`. Checking
+  `== NULL` or `!= NULL` will miss the error and use an invalid pointer.
+  ```c
+  /* BAD - mmap never returns NULL on failure */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == NULL)       /* WRONG - will not catch MAP_FAILED */
+      return -1;
+
+  /* GOOD */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == MAP_FAILED)
+      return -1;
+  ```
+- **Statistics accumulation using `=` instead of `+=`**: When accumulating
+  statistics (counters, byte totals, packet counts), using `=` overwrites
+  the running total with only the latest value. This silently produces
+  wrong results.
+  ```c
+  /* BAD - overwrites instead of accumulating */
+  stats->rx_packets = nb_rx;
+  stats->rx_bytes = total_bytes;
+
+  /* GOOD - accumulates over time */
+  stats->rx_packets += nb_rx;
+  stats->rx_bytes += total_bytes;
+  ```
+  Note: `=` is correct for gauge-type values (e.g., queue depth, link
+  status) and for initial assignment. Only flag when the context is
+  clearly incremental accumulation (loop bodies, per-burst counters,
+  callback tallies).
+- **Integer multiply without widening cast**: When multiplying integers
+  to produce a result wider than the operands (sizes, offsets, byte
+  counts), the multiplication is performed at the operand width and
+  the upper bits are silently lost before the assignment. This applies
+  to any narrowing scenario: 16×16 assigned to a 32-bit variable,
+  32×32 assigned to a 64-bit variable, etc.
+  ```c
+  /* BAD - 32×32 overflows before widening to 64 */
+  uint64_t total_size = num_entries * entry_size;  /* both are uint32_t */
+  size_t offset = ring->idx * ring->desc_size;     /* 32×32 → truncated */
+
+  /* BAD - 16×16 overflows before widening to 32 */
+  uint32_t byte_count = pkt_len * nb_segs;         /* both are uint16_t */
+
+  /* GOOD - widen before multiply */
+  uint64_t total_size = (uint64_t)num_entries * entry_size;
+  size_t offset = (size_t)ring->idx * ring->desc_size;
+  uint32_t byte_count = (uint32_t)pkt_len * nb_segs;
+  ```
+- **Unbounded descriptor chain traversal**: When walking a chain of
+  descriptors (virtio, DMA, NIC Rx/Tx rings) where the chain length
+  or next-index comes from guest memory or an untrusted API caller,
+  the traversal MUST have a bounds check or loop counter to prevent
+  infinite loops or out-of-bounds access from malicious/corrupt data.
+  ```c
+  /* BAD - guest controls desc[idx].next with no bound */
+  while (desc[idx].flags & VRING_DESC_F_NEXT) {
+      idx = desc[idx].next;          /* guest-supplied, unbounded */
+      process(desc[idx]);
+  }
+
+  /* GOOD - cap iterations to descriptor ring size */
+  for (i = 0; i < ring_size; i++) {
+      if (!(desc[idx].flags & VRING_DESC_F_NEXT))
+          break;
+      idx = desc[idx].next;
+      if (idx >= ring_size)          /* bounds check */
+          return -EINVAL;
+      process(desc[idx]);
+  }
+  ```
+  This applies to any chain/linked-list traversal where indices or
+  pointers originate from untrusted input (guest VMs, user-space
+  callers, network packets).
+- **Bitmask shift using `1` instead of `1ULL` on 64-bit masks**: The
+  literal `1` is `int` (32 bits). Shifting it by 32 or more is
+  undefined behavior; shifting it by less than 32 but assigning to a
+  `uint64_t` silently zeroes the upper 32 bits. Use `1ULL << n`,
+  `UINT64_C(1) << n`, or the DPDK `RTE_BIT64(n)` macro.
+  ```c
+  /* BAD - 1 is int, UB if n >= 32, wrong if result used as uint64_t */
+  uint64_t mask = 1 << bit_pos;
+  if (features & (1 << VIRTIO_NET_F_MRG_RXBUF))  /* bit 15 OK, bit 32+ UB */
+
+  /* GOOD */
+  uint64_t mask = UINT64_C(1) << bit_pos;
+  uint64_t mask = 1ULL << bit_pos;
+  uint64_t mask = RTE_BIT64(bit_pos);        /* preferred in DPDK */
+  if (features & RTE_BIT64(VIRTIO_NET_F_MRG_RXBUF))
+  ```
+  Note: `1U << n` is acceptable when the mask is known to be 32-bit
+  (e.g., `uint32_t` register fields with `n < 32`). Only flag when
+  the result is stored in, compared against, or returned as a 64-bit
+  type, or when `n` could be >= 32.
+- **Variable overwrite before read (dead store)**: A variable is
+  assigned a value that is unconditionally overwritten before it is
+  ever read. This usually indicates a logic error (wrong variable
+  name, missing `if`, copy-paste mistake) or at minimum is dead code.
+  ```c
+  /* BAD - first assignment is never read */
+  ret = validate_input(cfg);
+  ret = apply_config(cfg);     /* overwrites without checking first ret */
+  if (ret != 0)
+      return ret;
+
+  /* GOOD - check each return value */
+  ret = validate_input(cfg);
+  if (ret != 0)
+      return ret;
+  ret = apply_config(cfg);
+  if (ret != 0)
+      return ret;
+  ```
+  Do NOT flag cases where the initial value is intentionally a default
+  that may or may not be overwritten (e.g., `int ret = 0;` followed
+  by a conditional assignment). Only flag unconditional overwrites
+  where the first value can never be observed.
+- **Shared loop counter in nested loops**: Using the same variable as
+  the loop counter in both an outer and inner loop causes the outer
+  loop to malfunction because the inner loop modifies its counter.
+  ```c
+  /* BAD - inner loop clobbers outer loop counter */
+  int i;
+  for (i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (i = 0; i < nb_descs; i++)    /* BUG: reuses i */
+          init_desc(i);
+  }
+
+  /* GOOD - distinct loop counters */
+  for (int i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (int j = 0; j < nb_descs; j++)
+          init_desc(j);
+  }
+  ```
+- **`memcpy`/`memcmp`/`memset` self-argument (same pointer as both
+  operands)**: Passing the same pointer as both source and destination
+  to `memcpy()` is undefined behavior per C99. Passing the same
+  pointer to both arguments of `memcmp()` is a no-op that always
+  returns 0, indicating a logic error (usually a copy-paste mistake
+  with the wrong variable name). The same applies to `rte_memcpy()`
+  and `memmove()` with identical arguments.
+  ```c
+  /* BAD - memcpy with same src and dst is undefined behavior */
+  memcpy(buf, buf, len);
+  rte_memcpy(dst, dst, len);
+
+  /* BAD - memcmp with same pointer always returns 0 (logic error) */
+  if (memcmp(key, key, KEY_LEN) == 0)  /* always true, wrong variable? */
+
+  /* BAD - likely copy-paste: should be comparing two different MACs */
+  if (memcmp(&eth->src_addr, &eth->src_addr, RTE_ETHER_ADDR_LEN) == 0)
+
+  /* GOOD - comparing two different things */
+  memcpy(dst, src, len);
+  if (memcmp(&eth->src_addr, &eth->dst_addr, RTE_ETHER_ADDR_LEN) == 0)
+  ```
+  This pattern almost always indicates a copy-paste bug where one of
+  the arguments should be a different variable.
+- **`rte_pktmbuf_free_bulk()` on mixed-pool mbuf arrays**: Tx burst functions
+  and ring/queue dequeue paths receive mbufs that may originate from different
+  mempools (applications are free to send mbufs from any pool).
+  `rte_pktmbuf_free_bulk()` returns ALL mbufs to the pool of the first mbuf
+  in the array. If mbufs come from different pools, subsequent mbufs are
+  returned to the wrong pool, corrupting pool accounting and causing
+  hard-to-debug failures.
+  ```c
+  /* BAD - assumes all mbufs are from the same pool */
+  /* (in tx_burst completion or ring dequeue error path) */
+  rte_pktmbuf_free_bulk(mbufs, nb_mbufs);
+
+  /* GOOD - free individually (each mbuf returned to its own pool) */
+  for (i = 0; i < nb_mbufs; i++)
+      rte_pktmbuf_free(mbufs[i]);
+
+  /* GOOD - batch by pool if performance matters */
+  /* group mbufs by pool, then call rte_mempool_put_bulk per group */
+  ```
+  This applies to any path that frees mbufs submitted by the application:
+  Tx completion, Tx error cleanup, and ring/queue drain paths. Rx burst
+  functions that allocate all mbufs from a single pool are not affected.
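
The error-path cleanup pattern described several items above can be
sketched as follows (hypothetical `demo_setup`, not DPDK code; every
error path after the first allocation releases what was acquired so
far):

```c
#include <stdlib.h>

/* Illustrative sketch: two allocations sharing one cleanup label. */
static int
demo_setup(size_t a_len, size_t b_len, char **out_a, char **out_b)
{
    char *a, *b = NULL;

    a = malloc(a_len);
    if (a == NULL)
        return -1;      /* nothing allocated yet, plain return is fine */
    b = malloc(b_len);
    if (b == NULL)
        goto fail;      /* must release 'a', so jump to cleanup */

    *out_a = a;
    *out_b = b;
    return 0;
fail:
    free(b);            /* free(NULL) is a safe no-op */
    free(a);
    return -1;
}
```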
+
+### Architecture & Patterns
+- Code that violates existing patterns in the code base
+- Missing error handling
+- Code that is not safe against signals
+
+### New Library API Design
+
+When a patch adds a new library under `lib/`, review API design in
+addition to correctness and style.
+
+**API boundary.** A library should be a compiler, not a framework.
+The model is `rte_acl`: create a context, feed input, get structured
+output, caller decides what to do with it. No callbacks needed. If
+the library requires callers to implement a callback table to
+function, the boundary is wrong — the library is asking the caller
+to be its backend.
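+
+A sketch of the compiler-style boundary (the `rte_foo_*` names are
+illustrative, not an existing DPDK API):
+
+```c
+/* GOOD - create a context, feed input, get structured output */
+struct rte_foo_ctx *ctx = rte_foo_create(&params);
+struct rte_foo_result res;
+int ret = rte_foo_process(ctx, input, input_len, &res);
+/* caller decides what to do with res */
+rte_foo_destroy(ctx);
+
+/* BAD - library cannot work until the caller implements its backend */
+struct rte_foo_ops {
+	int (*read_input)(void *priv, void *buf, size_t len);
+	int (*emit_output)(void *priv, const struct rte_foo_result *res);
+};
+struct rte_foo_ctx *ctx2 = rte_foo_create_with_ops(&ops, priv);
+```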
+
+**Callback structs** (Warning / Error). Any function-pointer struct
+in an installed header is an ABI break waiting to happen. Adding or
+reordering a member breaks all consumers.
+- Prefer a single callback parameter over an ops table.
+- \>5 callbacks: **Warning** — likely needs redesign.
+- \>20 callbacks: **Error** — this is an app plugin API, not a library.
+- All callbacks must have Doxygen (contract, return values, ownership).
+- Void-returning callbacks for failable operations swallow errors —
+  flag as **Error**.
+- Callbacks serving app-specific needs (e.g. `verbose_level_get`)
+  indicate wrong code was extracted into the library.
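+
+For example, a void-returning callback for a failable operation
+(illustrative signatures, not an existing API):
+
+```c
+/* Error - a void callback swallows the write failure */
+struct rte_foo_ops {
+	void (*write)(void *priv, const void *buf, size_t len);
+};
+
+/* Better - failable operations report errors back to the library */
+struct rte_foo_ops {
+	/** Write @p len bytes. @return 0 on success, negative errno on failure. */
+	int (*write)(void *priv, const void *buf, size_t len);
+};
+```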
+
+**Extensible structures.** Prefer TLV / tagged-array patterns over
+enum + union, following `rte_flow_item` and `rte_flow_action` as
+the model. Type tag + pointer to type-specific data allows adding
+types without ABI breaks. Flag as **Warning**:
+- Large enums (100+ values) that consumers must switch on.
+- Unions that grow with every new feature.
+- Ask: "What changes when a feature is added next release?" If the
+  answer is "add an enum value and a union arm", it should be TLV.
+
+**Installed headers.** If it's in `headers` or `indirect_headers`
+in meson.build, it's public API. Don't call it "private." If truly
+internal, don't install it.
+
+**Global state.** Prefer handle-based APIs (`create`/`destroy`)
+over singletons. `rte_acl` allows multiple independent classifier
+instances; new libraries should do the same.
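+
+Handle-based shape (illustrative names):
+
+```c
+/* GOOD - multiple independent instances, as with rte_acl contexts */
+struct rte_foo *a = rte_foo_create("inst_a", socket_id);
+struct rte_foo *b = rte_foo_create("inst_b", socket_id);
+/* a and b can be configured and used independently */
+rte_foo_destroy(b);
+rte_foo_destroy(a);
+
+/* BAD - hidden singleton; only one instance per process */
+rte_foo_init();
+rte_foo_do_work();
+```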
+
+**Output ownership.** Prefer caller-allocated buffers, or
+library-allocated buffers that the caller frees, over internal
+static buffers. If static buffers are used, document their lifetime
+and ensure Doxygen examples don't show stale-pointer usage.
+
+---
+
+## C Coding Style
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a new line
+- Code style should be consistent within each file
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Consider using the `RTE_` or `rte_` prefix only for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` -> Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` -> Use: denylist/allowlist, blocklist/passlist
+- `cripple` -> Use: impacted, degraded, restricted, immobilized
+- `tribe` -> Use: team, squad
+- `sanity check` -> Use: coherence check, test, verification
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (likely(p != NULL))   /* Good - likely/unlikely don't change this */
+if (unlikely(p == NULL)) /* Good - likely/unlikely don't change this */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (errno != 0)     /* Good - this IS explicit */
+if (likely(a != 0)) /* Good - likely/unlikely don't change this */
+if (!a)             /* Bad - don't use ! on integers */
+if (a)              /* Bad - implicit, should be a != 0 */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+**Explicit comparison** means using `==` or `!=` operators (e.g., `x != 0`, `p == NULL`).
+**Implicit comparison** means relying on truthiness without an operator (e.g., `if (x)`, `if (!p)`).
+**Note**: `likely()` and `unlikely()` macros do NOT affect whether a comparison is explicit or implicit.
+
+### Boolean Usage
+
+Prefer `bool` (from `<stdbool.h>`) over `int` for variables,
+parameters, and return values that are purely true/false. Using
+`bool` makes intent explicit, enables compiler diagnostics for
+misuse, and is self-documenting.
+
+```c
+/* Bad - int used as boolean flag */
+int verbose = 0;
+int is_enabled = 1;
+
+int
+check_valid(struct item *item)
+{
+	if (item->flags & ITEM_VALID)
+		return 1;
+	return 0;
+}
+
+/* Good - bool communicates intent */
+bool verbose = false;
+bool is_enabled = true;
+
+bool
+check_valid(struct item *item)
+{
+	return item->flags & ITEM_VALID;
+}
+```
+
+**Guidelines:**
+- Use `bool` for variables that only hold true/false values
+- Use `bool` return type for predicate functions (functions that
+  answer a yes/no question, often named `is_*`, `has_*`, `can_*`)
+- Use `true`/`false` rather than `1`/`0` for boolean assignments
+- Boolean variables and parameters should not use explicit
+  comparison: `if (verbose)` is correct, not `if (verbose == true)`
+- `int` is still appropriate when a value can be negative, is an
+  error code, or carries more than two states
+
+**Structure fields:**
+- `bool` occupies 1 byte. In packed or cache-critical structures,
+  consider using a bitfield or flags word instead
+- For configuration structures and non-hot-path data, `bool` is
+  preferred over `int` for flag fields
+
+```c
+/* Bad - int flags waste space and obscure intent */
+struct port_config {
+	int promiscuous;     /* 0 or 1 */
+	int link_up;         /* 0 or 1 */
+	int autoneg;         /* 0 or 1 */
+	uint16_t mtu;
+};
+
+/* Good - bool for flag fields */
+struct port_config {
+	bool promiscuous;
+	bool link_up;
+	bool autoneg;
+	uint16_t mtu;
+};
+
+/* Also good - bitfield for cache-critical structures */
+struct fast_path_config {
+	uint32_t flags;      /* bitmask of CONFIG_F_* */
+	/* ... hot-path fields ... */
+};
+```
+
+**Do NOT flag:**
+- `int` return type for functions that return error codes (0 for
+  success, negative for error) — these are NOT boolean
+- `int` used for tri-state or multi-state values
+- `int` flags in existing code where changing the type would be a
+  large, unrelated refactor
+- Bitfield or flags-word approaches in performance-critical
+  structures
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Variable-Length Arrays
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free() (CWE-14)
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store (CWE-14: Compiler Removal of Code to Clear Buffers). For security-sensitive data, use `explicit_bzero()`, `rte_memset_sensitive()`, or `rte_free_sensitive()` which the compiler is not permitted to eliminate.
+
+```c
+/* Bad - compiler may eliminate memset */
+memset(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - explicit_bzero cannot be optimized away */
+explicit_bzero(secret_key, sizeof(secret_key));
+free(secret_key);
+
+/* Good - DPDK wrapper for clearing sensitive data */
+rte_memset_sensitive(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for rte_malloc'd sensitive data, combined clear+free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()`, which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-visible memory (e.g. a descriptor ring) */
+struct desc *ring = rte_malloc(NULL, n * sizeof(*ring), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+### Non-const Function Pointer Arrays
+
+Arrays of function pointers (ops tables, dispatch tables, callback arrays)
+should be declared `const` when their contents are fixed at compile time.
+A non-`const` function pointer array can be overwritten by bugs or exploits,
+and prevents the compiler from placing the table in read-only memory.
+
+```c
+/* Bad - mutable when it doesn't need to be */
+static rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+
+/* Good - immutable dispatch table */
+static const rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+```
+
+**Exceptions** (do NOT flag):
+- Arrays modified at runtime for CPU feature detection or capability probing
+  (e.g., selecting a burst function based on `rte_cpu_get_flag_enabled()`)
+- Arrays containing mutable state (e.g., entries that are linked into lists)
+- Arrays populated dynamically via registration APIs
+- `dev_ops` or similar structures assigned per-device at init time
+
+Only flag when the array is fully initialized at declaration with constant
+values and never modified thereafter.
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
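+
+Typical conversion (the `EAL` logtype is illustrative; libraries and
+drivers normally register their own logtype):
+
+```c
+/* Bad - in lib/ or drivers/ */
+perror("open failed");
+printf("port %u started\n", port_id);
+
+/* Good */
+RTE_LOG(ERR, EAL, "open failed: %s\n", strerror(errno));
+RTE_LOG(INFO, EAL, "port %u started\n", port_id);
+```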
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+
+#### Shared Variable Access: volatile vs Atomics
+
+Variables shared between threads or between a thread and a signal
+handler **must** use atomic operations. The C `volatile` keyword is
+NOT a substitute for atomics — it prevents compiler optimization
+of accesses but provides no atomicity guarantees and no memory
+ordering between threads. On some architectures, `volatile` reads
+and writes may tear on unaligned or multi-word values.
+
+DPDK provides C11 atomic wrappers that are portable across all
+supported compilers and architectures. Always use these for shared
+state.
+
+**Reading shared variables:**
+
+```c
+/* BAD - volatile provides no atomicity or ordering guarantee */
+volatile int stop_flag;
+if (stop_flag)           /* data race, compiler/CPU can reorder */
+    return;
+
+/* BAD - direct access to shared variable without atomic */
+if (shared->running)     /* undefined behavior if another thread writes */
+    process();
+
+/* GOOD - DPDK C11 atomic wrapper */
+if (rte_atomic_load_explicit(&shared->stop_flag, rte_memory_order_acquire))
+    return;
+
+/* GOOD - relaxed is fine for statistics or polling a flag where
+ * you don't need to synchronize other memory accesses */
+count = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+```
+
+**Writing shared variables:**
+
+```c
+/* BAD - volatile write */
+volatile int *flag = &shared->ready;
+*flag = 1;
+
+/* GOOD - atomic store with appropriate ordering */
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+```
+
+**Read-modify-write operations:**
+
+```c
+/* BAD - not atomic even with volatile */
+volatile uint64_t *counter = &stats->packets;
+*counter += nb_rx;       /* not atomic: load, add, store are 3 separate operations */
+
+/* GOOD - atomic add */
+rte_atomic_fetch_add_explicit(&stats->packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Forbidden Atomic APIs in New Code
+
+New code **must not** use GCC/Clang `__atomic_*` built-ins or the
+legacy DPDK `rte_smp_*mb()` barriers. These are deprecated and
+will be removed. Use the DPDK C11 atomic wrappers instead.
+
+**GCC/Clang `__atomic_*` built-ins — do not use:**
+
+```c
+/* BAD - GCC built-in, not portable, not DPDK API */
+val = __atomic_load_n(&shared->count, __ATOMIC_RELAXED);
+__atomic_store_n(&shared->flag, 1, __ATOMIC_RELEASE);
+__atomic_fetch_add(&shared->counter, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&shared->state, &expected, desired,
+    0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+
+/* GOOD - DPDK C11 atomic wrappers */
+val = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+rte_atomic_fetch_add_explicit(&shared->counter, 1, rte_memory_order_relaxed);
+rte_atomic_compare_exchange_strong_explicit(&shared->state, &expected, desired,
+    rte_memory_order_acq_rel, rte_memory_order_acquire);
+rte_atomic_thread_fence(rte_memory_order_seq_cst);
+```
+
+Similarly, do not use `__sync_*` built-ins (`__sync_fetch_and_add`,
+`__sync_bool_compare_and_swap`, etc.) — these are the older GCC
+atomics with implicit full barriers and are even less appropriate
+than `__atomic_*`.
+
+**Legacy DPDK barriers — do not use:**
+
+```c
+/* BAD - legacy DPDK barriers, deprecated */
+rte_smp_mb();            /* full memory barrier */
+rte_smp_rmb();           /* read memory barrier */
+rte_smp_wmb();           /* write memory barrier */
+
+/* GOOD - C11 fence with explicit ordering */
+rte_atomic_thread_fence(rte_memory_order_seq_cst);   /* replaces rte_smp_mb() */
+rte_atomic_thread_fence(rte_memory_order_acquire);    /* replaces rte_smp_rmb() */
+rte_atomic_thread_fence(rte_memory_order_release);    /* replaces rte_smp_wmb() */
+
+/* BETTER - use ordering on the atomic operation itself when possible */
+val = rte_atomic_load_explicit(&shared->flag, rte_memory_order_acquire);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+The legacy `rte_atomic16/32/64_*()` type-specific functions (e.g.,
+`rte_atomic32_inc()`, `rte_atomic64_read()`) are also deprecated.
+Use `rte_atomic_fetch_add_explicit()`, `rte_atomic_load_explicit()`,
+etc. with standard C integer types.
+
+| Deprecated API | Replacement |
+|----------------|-------------|
+| `__atomic_load_n()` | `rte_atomic_load_explicit()` |
+| `__atomic_store_n()` | `rte_atomic_store_explicit()` |
+| `__atomic_fetch_add()` | `rte_atomic_fetch_add_explicit()` |
+| `__atomic_compare_exchange_n()` | `rte_atomic_compare_exchange_strong_explicit()` |
+| `__atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+| `__ATOMIC_RELAXED` | `rte_memory_order_relaxed` |
+| `__ATOMIC_ACQUIRE` | `rte_memory_order_acquire` |
+| `__ATOMIC_RELEASE` | `rte_memory_order_release` |
+| `__ATOMIC_ACQ_REL` | `rte_memory_order_acq_rel` |
+| `__ATOMIC_SEQ_CST` | `rte_memory_order_seq_cst` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence(rte_memory_order_seq_cst)` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence(rte_memory_order_acquire)` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence(rte_memory_order_release)` |
+| `rte_atomic32_inc(&v)` | `rte_atomic_fetch_add_explicit(&v, 1, rte_memory_order_relaxed)` |
+| `rte_atomic64_read(&v)` | `rte_atomic_load_explicit(&v, rte_memory_order_relaxed)` |
+
+#### Memory Ordering Guide
+
+Use the weakest ordering that is correct. Stronger ordering
+constrains hardware and compiler optimization unnecessarily.
+
+| DPDK Ordering | When to Use |
+|---------------|-------------|
+| `rte_memory_order_relaxed` | Statistics counters, polling flags where no other data depends on the value. Most common for simple counters. |
+| `rte_memory_order_acquire` | **Load** side of a flag/pointer that guards access to other shared data. Ensures subsequent reads see data published by the releasing thread. |
+| `rte_memory_order_release` | **Store** side of a flag/pointer that publishes shared data. Ensures all prior writes are visible to a thread that does an acquire load. |
+| `rte_memory_order_acq_rel` | Read-modify-write operations (e.g., `fetch_add`) that both consume and publish shared state in one operation. |
+| `rte_memory_order_seq_cst` | Rarely needed. Only when multiple independent atomic variables must be observed in a globally consistent total order. Avoid unless required. |
+
+**Common pattern — producer/consumer flag:**
+
+```c
+/* Producer thread: fill buffer, then signal ready */
+fill_buffer(buf, data, len);
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+
+/* Consumer thread: wait for flag, then read buffer */
+while (rte_atomic_load_explicit(&shared->ready, rte_memory_order_acquire) == 0)
+    rte_pause();
+process_buffer(buf, len);  /* guaranteed to see producer's writes */
+```
+
+**Common pattern — statistics counter (no ordering needed):**
+
+```c
+rte_atomic_fetch_add_explicit(&port_stats->rx_packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Standalone Fences
+
+Prefer ordering on the atomic operation itself (acquire load,
+release store) over standalone fences. Standalone fences
+(`rte_atomic_thread_fence()`) are a blunt instrument that
+orders ALL memory accesses around the fence, not just the
+atomic variable you care about.
+
+```c
+/* Acceptable but less precise - standalone fence */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_relaxed);
+rte_atomic_thread_fence(rte_memory_order_release);
+
+/* Preferred - ordering on the operation itself */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+Standalone fences are appropriate when synchronizing multiple
+non-atomic writes (e.g., filling a structure before publishing
+a pointer to it) where annotating each write individually is
+impractical.
+
+#### When volatile Is Still Acceptable
+
+`volatile` remains correct for:
+- Memory-mapped I/O registers (hardware MMIO)
+- Variables shared with signal handlers in single-threaded contexts
+- Interaction with `setjmp`/`longjmp`
+
+`volatile` is NOT correct for:
+- Any variable accessed by multiple threads
+- Polling flags between lcores
+- Statistics counters updated from multiple threads
+- Flags set by one thread and read by another
+
+**Do NOT flag** `volatile` used for MMIO or hardware register access
+(common in drivers under `drivers/*/base/`).
+
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
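+
+Typical conversion (error handling abbreviated; note that an EAL
+thread function returns `uint32_t`, not `void *`):
+
+```c
+/* Bad - raw pthread in portable code */
+pthread_t tid;
+pthread_create(&tid, NULL, pthread_worker, arg);
+pthread_join(tid, NULL);
+
+/* Good - EAL thread API */
+static uint32_t
+worker(void *arg)
+{
+	/* thread body */
+	return 0;
+}
+
+rte_thread_t tid;
+if (rte_thread_create(&tid, NULL, worker, arg) != 0)
+	return -1;
+rte_thread_join(tid, NULL);
+```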
+
+### Process-Shared Synchronization
+
+When placing synchronization primitives in shared memory (memory accessible by multiple processes, such as DPDK primary/secondary processes or `mmap`'d regions), they **must** be initialized with process-shared attributes. Failure to do so causes **undefined behavior** that may appear to work in testing but fail unpredictably in production.
+
+#### pthread Mutexes in Shared Memory
+
+**This is an error** - mutex in shared memory without `PTHREAD_PROCESS_SHARED`:
+
+```c
+/* BAD - undefined behavior when used across processes */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+void
+init_shared(struct shared_data *shm)
+{
+	pthread_mutex_init(&shm->lock, NULL);  /* ERROR: missing pshared attribute */
+}
+```
+
+**Correct implementation**:
+
+```c
+/* GOOD - properly initialized for cross-process use */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int
+init_shared(struct shared_data *shm)
+{
+	pthread_mutexattr_t attr;
+	int ret;
+
+	ret = pthread_mutexattr_init(&attr);
+	if (ret != 0)
+		return -ret;
+
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret != 0) {
+		pthread_mutexattr_destroy(&attr);
+		return -ret;
+	}
+
+	ret = pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	return -ret;
+}
+```
+
+#### pthread Condition Variables in Shared Memory
+
+Condition variables also require the process-shared attribute:
+
+```c
+/* BAD - will not work correctly across processes */
+pthread_cond_init(&shm->cond, NULL);
+
+/* GOOD */
+pthread_condattr_t cattr;
+pthread_condattr_init(&cattr);
+pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+pthread_cond_init(&shm->cond, &cattr);
+pthread_condattr_destroy(&cattr);
+```
+
+#### pthread Read-Write Locks in Shared Memory
+
+```c
+/* BAD */
+pthread_rwlock_init(&shm->rwlock, NULL);
+
+/* GOOD */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init(&rwattr);
+pthread_rwlockattr_setpshared(&rwattr, PTHREAD_PROCESS_SHARED);
+pthread_rwlock_init(&shm->rwlock, &rwattr);
+pthread_rwlockattr_destroy(&rwattr);
+```
+
+#### When to Flag This Issue
+
+Flag as an **Error** when ALL of the following are true:
+1. A `pthread_mutex_t`, `pthread_cond_t`, `pthread_rwlock_t`, or `pthread_barrier_t` is initialized
+2. The primitive is stored in shared memory (identified by context such as: structure in `rte_malloc`/`rte_memzone`, `mmap`'d memory, memory passed to secondary processes, or structures documented as shared)
+3. The initialization uses `NULL` attributes or attributes without `PTHREAD_PROCESS_SHARED`
+
+**Do NOT flag** when:
+- The mutex is in thread-local or process-private heap memory (`malloc`)
+- The mutex is a local/static variable not in shared memory
+- The code already uses `pthread_mutexattr_setpshared()` with `PTHREAD_PROCESS_SHARED`
+- The synchronization uses DPDK primitives (`rte_spinlock_t`, `rte_rwlock_t`) which are designed for shared memory
+
+#### Preferred Alternatives
+
+For DPDK code, prefer DPDK's own synchronization primitives which are designed for shared memory:
+
+| pthread Primitive | DPDK Alternative |
+|-------------------|------------------|
+| `pthread_mutex_t` | `rte_spinlock_t` (busy-wait) or properly initialized pthread mutex |
+| `pthread_rwlock_t` | `rte_rwlock_t` |
+| `pthread_spinlock_t` | `rte_spinlock_t` |
+
+Note: `rte_spinlock_t` and `rte_rwlock_t` work correctly in shared memory without special initialization, but they are spinning locks unsuitable for long wait times.
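+
+For example, a spinlock placed in shared memory needs no attribute
+setup:
+
+```c
+struct shared_data {
+	rte_spinlock_t lock;
+	int counter;
+};
+
+/* works across primary/secondary processes as-is */
+rte_spinlock_init(&shm->lock);
+
+rte_spinlock_lock(&shm->lock);
+shm->counter++;
+rte_spinlock_unlock(&shm->lock);
+```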
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `PRId64`, `PRIu64`, `PRIx64` (used as `"%" PRIu64`) |
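+
+The `PRI*64` macros from `<inttypes.h>` expand to the correct length
+modifier for the platform and are concatenated into the format string:
+
+```c
+#include <inttypes.h>
+
+uint64_t pkts = stats.rx_packets;
+
+/* Bad - assumes uint64_t is unsigned long long */
+printf("rx: %llu\n", pkts);
+
+/* Good - portable 64-bit format */
+printf("rx: %" PRIu64 "\n", pkts);
+```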
+
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
+---
+
+## Deprecated API Usage
+
+New patches must not introduce usage of deprecated APIs, macros, or functions.
+Deprecated items are marked with `RTE_DEPRECATED` or documented in the
+deprecation notices section of the release notes.
+
+### Rules for New Code
+
+- Do not call functions marked with `RTE_DEPRECATED` or `__rte_deprecated`
+- Do not use macros that have been superseded by newer alternatives
+- Do not use data structures or enum values marked as deprecated
+- Check `doc/guides/rel_notes/deprecation.rst` for planned deprecations
+- When a deprecated API has a replacement, use the replacement
+
+### Deprecating APIs
+
+A patch may mark an API as deprecated provided:
+
+- No remaining usages exist in the current DPDK codebase
+- The deprecation is documented in the release notes
+- A migration path or replacement API is documented
+- The `RTE_DEPRECATED` macro is used to generate compiler warnings
+
+```c
+/* Marking a function as deprecated */
+__rte_deprecated
+int
+rte_old_function(void);
+
+/* With a message pointing to the replacement */
+__rte_deprecated_msg("use rte_new_function() instead")
+int
+rte_old_function(void);
+```
+
+### Common Deprecated Patterns
+
+| Deprecated | Replacement | Notes |
+|-----------|-------------|-------|
+| `rte_atomic*_t` types | C11 atomics | Use `rte_atomic_xxx()` wrappers |
+| `rte_smp_*mb()` barriers | `rte_atomic_thread_fence()` | See Atomics section |
+| `pthread_*()` in portable code | `rte_thread_*()` | See Threading section |
+
+When reviewing patches that add new code, flag any usage of deprecated APIs
+as requiring change to use the modern replacement.
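+
+A sketch of migrating the barrier and atomic patterns above (the `cnt`
+and `flag` fields are hypothetical):
+
+```c
+/* Deprecated - legacy atomic type and SMP barrier */
+rte_atomic32_inc(&s->cnt);
+rte_smp_wmb();
+
+/* Preferred - C11-style wrappers with explicit, minimal ordering */
+rte_atomic_fetch_add_explicit(&s->cnt, 1, rte_memory_order_relaxed);
+rte_atomic_store_explicit(&s->flag, 1, rte_memory_order_release);
+```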
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - tag placed in a .c file (only allowed in headers) */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+**Note for AI reviewers**: You cannot verify compilation order or cross-patch dependencies from patch review alone. Do NOT flag patches claiming they "would fail to compile" based on symbols used in other patches in the series. Assume the patch author has ordered them correctly.
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be exercised by tests under the `app/test` directory
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include "test.h"
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, NOHUGE_OK, ASAN_OK, test_feature);
+```
+
+The `REGISTER_FAST_TEST` macro parameters are:
+- Test name (e.g., `feature_autotest`)
+- `NOHUGE_OK` or `HUGEPAGES_REQUIRED` - whether test can run without hugepages
+- `ASAN_OK` or `ASAN_FAILS` - whether test is compatible with Address Sanitizer
+- Test function name
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+- Release notes are NOT required for:
+  - Test-only changes (unit tests, functional tests)
+  - Internal APIs and helper functions (not exported to applications)
+  - Internal implementation changes that don't affect public API
+
+### RST Documentation Style
+
+When reviewing `.rst` documentation files, prefer **definition lists**
+over simple bullet lists where each item has a term and a description.
+Definition lists produce better-structured HTML/PDF output and are
+easier to scan.
+
+**When to suggest a definition list:**
+- A bullet list where each item starts with a bold or emphasized term
+  followed by a dash, colon, or long explanation
+- Lists of options, parameters, configuration values, or features
+  where each entry has a name and a description
+- Glossary-style enumerations
+
+**When a simple list is fine (do NOT flag):**
+- Short lists of items without descriptions (e.g., file names, steps)
+- Lists where items are single phrases or sentences with no term/definition structure
+- Enumerated steps in a procedure
+
+**RST definition list syntax:**
+
+```rst
+term 1
+   Description of term 1.
+
+term 2
+   Description of term 2.
+   Can span multiple lines.
+```
+
+**Example — flag this pattern:**
+
+```rst
+* **error** - Fail with error (default)
+* **truncate** - Truncate content to fit token limit
+* **summary** - Request high-level summary review
+```
+
+**Suggest rewriting as:**
+
+```rst
+error
+   Fail with error (default).
+
+truncate
+   Truncate content to fit token limit.
+
+summary
+   Request high-level summary review.
+```
+
+This is a **Warning**-level suggestion, not an Error. Do not flag it
+when the existing list structure is appropriate (see "when a simple
+list is fine" above).
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+- Internal APIs (used only within DPDK, not exported to applications) do NOT require release notes
+
+### ABI Compatibility and Symbol Exports
+
+**IMPORTANT**: DPDK uses automatic symbol map generation. Do **NOT** recommend
+manually editing `version.map` files - they are auto-generated from source code
+annotations.
+
+#### Symbol Export Macros
+
+New public functions must be annotated with export macros (defined in
+`rte_export.h`). Place the macro on the line immediately before the function
+definition in the `.c` file:
+
+```c
+/* For stable ABI symbols */
+RTE_EXPORT_SYMBOL(rte_foo_create)
+int
+rte_foo_create(struct rte_foo_config *config)
+{
+	/* ... */
+}
+
+/* For experimental symbols (include version when first added) */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_new_feature, 25.03)
+__rte_experimental
+int
+rte_foo_new_feature(void)
+{
+	/* ... */
+}
+
+/* For internal symbols (shared between DPDK components only) */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_foo_internal_helper)
+int
+rte_foo_internal_helper(void)
+{
+	/* ... */
+}
+```
+
+#### Symbol Export Rules
+
+- `RTE_EXPORT_SYMBOL` - Use for stable ABI functions
+- `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, ver)` - Use for new experimental APIs
+  (version is the DPDK release, e.g., `25.03`)
+- `RTE_EXPORT_INTERNAL_SYMBOL` - Use for functions shared between DPDK libs/drivers
+  but not part of public API
+- Export macros go in `.c` files, not headers
+- The build system generates linker version maps automatically
+
+#### What NOT to Review
+
+- Do **NOT** flag missing `version.map` updates - maps are auto-generated
+- Do **NOT** suggest adding symbols to `lib/*/version.map` files
+
+#### ABI Versioning for Changed Functions
+
+When changing the signature of an existing stable function, use versioning macros
+from `rte_function_versioning.h`:
+
+- `RTE_VERSION_SYMBOL` - Create versioned symbol for backward compatibility
+- `RTE_DEFAULT_SYMBOL` - Mark the new default version
+
+Follow ABI policy and versioning guidelines in the contributor documentation.
+Enable ABI checks with `DPDK_ABI_REF_VERSION` environment variable.
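+
+A sketch of versioning a changed function (macro arguments follow
+`rte_function_versioning.h`; the function name and ABI versions here
+are hypothetical):
+
+```c
+/* Old signature, kept for binaries linked against ABI 24 */
+RTE_VERSION_SYMBOL(24, int, rte_foo_get, (uint16_t id))
+{
+	return foo_get(id, 0);
+}
+
+/* New signature, the default from ABI 25 onward */
+RTE_DEFAULT_SYMBOL(25, int, rte_foo_get, (uint16_t id, uint32_t flags))
+{
+	return foo_get(id, flags);
+}
+```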
+
+---
+
+## LTS (Long Term Stable) Release Review
+
+LTS releases are DPDK versions ending in `.11` (e.g., 23.11, 22.11,
+21.11, 20.11, 19.11). When reviewing patches targeting an LTS branch,
+apply stricter criteria:
+
+### LTS-Specific Rules
+
+- **Only bug fixes allowed** -- no new features
+- **No new APIs** (experimental or stable)
+- **ABI must remain unchanged** -- no symbol additions, removals,
+  or signature changes
+- Backported fixes should reference the original commit with a
+  `Fixes:` tag
+- Copyright years should reflect when the code was originally
+  written
+- Be conservative: reject changes that are not clearly bug fixes
+
+### What to Flag on LTS Branches
+
+**Error:**
+- New feature code (new functions, new driver capabilities)
+- New experimental or stable API additions
+- ABI changes (new or removed symbols, changed function signatures)
+- Changes that add new configuration options or parameters
+
+**Warning:**
+- Large refactoring that goes beyond what is needed for a fix
+- Missing `Fixes:` tag on a backported bug fix
+- Missing `Cc: stable@dpdk.org`
+
+### When LTS Rules Apply
+
+LTS rules apply when the reviewer is told the target release is an
+LTS version (via the `--release` option or equivalent). If no
+release is specified, assume the patch targets the main development
+branch where new features and APIs are allowed.
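+
+For example, with the analyze-patch.py script from this series (option
+spelling as documented there):
+
+```bash
+# Review against LTS rules for the 23.11 stable branch
+./devtools/analyze-patch.py --release 23.11 fix-foo.patch
+```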
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message and License
+
+Checked by `devtools/checkpatches.sh` -- not duplicated here.
+
+### Code Style
+
+- [ ] Lines <=100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+- [ ] No usage of deprecated APIs, macros, or functions
+- [ ] Process-shared primitives in shared memory use `PTHREAD_PROCESS_SHARED`
+- [ ] `mmap()` return checked against `MAP_FAILED`, not `NULL`
+- [ ] Statistics use `+=` not `=` for accumulation
+- [ ] Integer multiplies widened before operation when result is 64-bit
+- [ ] Descriptor chain traversals bounded by ring size or loop counter
+- [ ] 64-bit bitmasks use `1ULL <<` or `RTE_BIT64()`, not `1 <<`
+- [ ] No unconditional variable overwrites before read
+- [ ] Nested loops use distinct counter variables
+- [ ] No `memcpy`/`memcmp` with identical source and destination pointers
+- [ ] `rte_pktmbuf_free_bulk()` not used on mixed-pool mbuf arrays (Tx paths, ring dequeue, error paths)
+- [ ] Static function pointer arrays declared `const` when contents are compile-time fixed
+- [ ] `bool` used for pure true/false variables, parameters, and predicate return types
+- [ ] Shared variables use `rte_atomic_*_explicit()`, not `volatile` or bare access
+- [ ] No `__atomic_*()` GCC built-ins or `__ATOMIC_*` ordering constants (use `rte_atomic_*_explicit()` and `rte_memory_order_*`)
+- [ ] No `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` (use `rte_atomic_thread_fence()`)
+- [ ] Memory ordering is the weakest correct choice (`relaxed` for counters, `acquire`/`release` for publish/consume)
+- [ ] Sensitive data cleared with `explicit_bzero()`/`rte_free_sensitive()`, not `memset()`
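+
+Two of the checklist items above, illustrated (the variables are
+hypothetical):
+
+```c
+/* Wrong - 32x32 multiply truncates before the 64-bit assignment */
+uint64_t total = nb_pkts * pkt_len;
+/* Correct - widen one operand first */
+uint64_t total = (uint64_t)nb_pkts * pkt_len;
+
+/* Wrong - mmap() reports failure as MAP_FAILED, not NULL */
+if (addr == NULL)
+	return -1;
+/* Correct */
+if (addr == MAP_FAILED)
+	return -1;
+```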
+
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+- [ ] New public functions have `RTE_EXPORT_*` macro in `.c` file
+- [ ] Experimental functions use `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, version)`
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] RST docs use definition lists for term/description patterns
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (<=3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+- No strict line length limit for meson files; lines under 100 characters are acceptable
+
+```python
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list
+headers = files(
+    'header1.h',
+    'header2.h',
+    'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must comply with formatting standards
+- Use **`black`** for code formatting validation
+- Line length acceptable up to 100 characters
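+
+A typical local check (exact options may vary with the project tooling):
+
+```bash
+black --check --line-length 100 devtools/analyze-patch.py
+```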
+
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+
+*Correctness bugs (highest value findings):*
+- Use-after-free
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference on reachable code path
+- Buffer overflow or out-of-bounds access
+- Missing error check on a function that can fail, leading to undefined behavior
+- Race condition on shared mutable state without synchronization
+- `volatile` used instead of atomics for inter-thread shared variables
+- `__atomic_*()` GCC built-ins in new code (must use `rte_atomic_*_explicit()`)
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` in new code (must use `rte_atomic_thread_fence()`)
+- Error path that skips necessary cleanup
+- `mmap()` return value checked against NULL instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=` (overwrite vs increment)
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied indices
+- `1 << n` used for 64-bit bitmask (undefined behavior if n >= 32)
+- Variable assigned then unconditionally overwritten before read
+- Same variable used as counter in nested loops
+- `memcpy`/`memcmp` with same pointer as both arguments (UB or no-op logic error)
+- `rte_pktmbuf_free_bulk()` on mbuf array where mbufs may come from different pools (Tx burst, ring dequeue)
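+
+Two of the correctness patterns above, illustrated (the fields are
+hypothetical):
+
+```c
+/* Wrong - overwrites the accumulated counter; shift is UB for n >= 32 */
+stats->rx_packets = nb_rx;
+uint64_t mask = 1 << n;
+
+/* Correct */
+stats->rx_packets += nb_rx;
+uint64_t mask = RTE_BIT64(n);
+```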
+
+*Process and format errors:*
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+- pthread mutex/cond/rwlock in shared memory without `PTHREAD_PROCESS_SHARED`
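+
+A sketch of correct process-shared mutex initialization (the `shm`
+pointer is hypothetical shared memory):
+
+```c
+pthread_mutexattr_t attr;
+
+pthread_mutexattr_init(&attr);
+pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+pthread_mutex_init(&shm->lock, &attr);
+pthread_mutexattr_destroy(&attr);
+```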
+
+*API design errors (new libraries only):*
+- Ops/callback struct with 20+ function pointers in an installed header
+- Callback struct members with no Doxygen documentation
+- Void-returning callbacks for failable operations (errors silently swallowed)
+
+**Warning** (should fix):
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- New public function missing `RTE_EXPORT_*` macro
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+- Usage of deprecated APIs, macros, or functions in new code
+- RST documentation using bullet lists where definition lists would be more appropriate
+- Ops/callback struct with >5 function pointers in an installed header (ABI risk)
+- New API using fixed enum+union where TLV pattern would be more extensible
+- Installed header labeled "private" or "internal" in meson.build
+- New library using global singleton instead of handle-based API
+- Static function pointer array not declared `const` when contents are compile-time constant
+- `int` used instead of `bool` for variables or return values that are purely true/false
+- `rte_memory_order_seq_cst` used where weaker ordering (`relaxed`, `acquire`/`release`) suffices
+- Standalone `rte_atomic_thread_fence()` where ordering on the atomic operation itself would be clearer
+
+**Do NOT flag** (common false positives):
+- Missing `version.map` updates (maps are auto-generated from `RTE_EXPORT_*` macros)
+- Suggesting manual edits to any `version.map` file
+- SPDX/copyright format, copyright years, copyright holders (not subject to AI review)
+- Commit message formatting (subject length, punctuation, tag order, case-sensitive terms) -- checked by checkpatch
+- Meson file lines under 100 characters
+- Comparisons using `== 0`, `!= 0`, `== NULL`, `!= NULL` as "implicit" (these ARE explicit)
+- Comparisons wrapped in `likely()` or `unlikely()` macros - these are still explicit if using == or !=
+- Anything you determine is correct (do not mention non-issues or say "No issue here")
+- `REGISTER_FAST_TEST` using `NOHUGE_OK`/`ASAN_OK` macros (this is the correct current format)
+- Missing release notes for test-only changes (unit tests do not require release notes)
+- Missing release notes for internal APIs or helper functions (only public APIs need release notes)
+- Any item you later correct with "(Correction: ...)" or "actually acceptable" - just omit it
+- Vague concerns ("should be verified", "should be checked") - if you're not sure it's wrong, don't flag it
+- Items where you say "which is correct" or "this is correct" - if it's correct, don't mention it at all
+- Items where you conclude "no issue here" or "this is actually correct" - omit these entirely
+- Clean patches in a series - do not include a patch just to say "no issues" or describe what it does
+- Cross-patch compilation dependencies - you cannot determine patch ordering correctness from review
+- Claims that a symbol "was removed in patch N" causing issues in patch M - assume author ordered correctly
+- Any speculation about whether patches will compile when applied in sequence
+- Mutexes/locks in process-private memory (standard `malloc`, stack, static non-shared) - these don't need `PTHREAD_PROCESS_SHARED`
+- Use of `rte_spinlock_t` or `rte_rwlock_t` in shared memory (these work correctly without special init)
+- `volatile` used for MMIO/hardware register access in drivers (this is correct usage)
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+# Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example:
+"`rte_foo_parse()` dereferences `name` before the NULL check, so a NULL
+`name` crashes. Move the `name == NULL` check before the first
+dereference."
+
+---
+
+## FINAL CHECK BEFORE SUBMITTING REVIEW
+
+Before outputting your review, do two separate passes:
+
+### Pass 1: Verify correctness bugs are included
+
+Ask: "Did I trace every error path for resource leaks? Did I check
+for use-after-free? Did I verify error codes are propagated?"
+
+If you identified a potential correctness bug but talked yourself
+out of it, **add it back**. It is better to report a possible bug
+than to miss a real one.
+
+### Pass 2: Remove style/process false positives
+
+For EACH style/process item, ask: "Did I conclude this is actually
+fine/correct/acceptable/no issue?"
+
+If YES, DELETE THAT ITEM. It should not be in your output.
+
+An item that says "X is wrong... actually this is correct" is a
+FALSE POSITIVE and must be removed. This applies to style, format,
+and process items only.
+
+**If your Errors section would be empty after this check, that's
+fine -- it means the patches are good.**
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 2/6] devtools: add multi-provider AI patch review script
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
@ 2026-03-04 17:59     ` Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

This is an AI-generated script that reviews DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 1348 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1348 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..4a2950d6a4
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,1348 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from datetime import date
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, Iterator
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Large file handling modes
+LARGE_FILE_MODES = ["error", "truncate", "chunk", "commits-only", "summary"]
+
+# Approximate characters per token (conservative estimate for code)
+CHARS_PER_TOKEN = 3.5
+
+# Default token limits by provider (leaving room for system prompt and response)
+PROVIDER_INPUT_LIMITS = {
+    "anthropic": 180000,  # 200K context, reserve for system/response
+    "openai": 900000,  # GPT-4.1 has 1M context
+    "xai": 1800000,  # Grok 4.1 Fast has 2M context
+    "google": 900000,  # Gemini 3 Flash has 1M context
+}
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# LTS releases: any DPDK release with minor version .11
+# (e.g., 19.11, 20.11, 21.11, 22.11, 23.11, 24.11, 25.11, ...)
+
+SYSTEM_PROMPT_BASE = """\
+You are an expert DPDK code reviewer. Analyze patches for compliance with \
+DPDK coding standards and contribution guidelines. Provide clear, actionable \
+feedback organized by severity (Error, Warning, Info) as defined in the \
+guidelines."""
+
+LTS_RULES = """
+LTS (Long Term Stable) branch rules apply:
+- Only bug fixes allowed, no new features
+- No new APIs (experimental or stable)
+- ABI must remain unchanged
+- Backported fixes should reference the original commit with Fixes: tag
+- Copyright years should reflect when the code was originally written
+- Be conservative: reject changes that aren't clearly bug fixes"""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Focus on:
+
+1. Correctness bugs (resource leaks, use-after-free, race conditions, etc.)
+2. C coding style (forbidden tokens, implicit comparisons, unnecessary patterns)
+3. API and documentation requirements
+4. Any other guideline violations
+
+Note: commit message formatting and SPDX/copyright compliance are checked \
+by checkpatches.sh and should NOT be flagged here.
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def is_lts_release(release: str | None) -> bool:
+    """Check if a release is an LTS release.
+
+    Per DPDK project guidelines, any release with minor version .11
+    is an LTS release (e.g., 19.11, 21.11, 23.11, 24.11, 25.11).
+    """
+    if not release:
+        return False
+    # Check for explicit -lts suffix
+    if "-lts" in release.lower():
+        return True
+    # Extract base version (e.g., "23.11" from "23.11.1" or "23.11-rc1")
+    version = release.split("-")[0]
+    parts = version.split(".")
+    if len(parts) >= 2:
+        try:
+            minor = int(parts[1])
+            return minor == 11
+        except ValueError:
+            pass
+    return False
+
+
+def estimate_tokens(text: str) -> int:
+    """Estimate token count from text length."""
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def split_mbox_patches(content: str) -> list[str]:
+    """Split an mbox file into individual patches."""
+    patches = []
+    current_patch = []
+    in_patch = False
+
+    for line in content.split("\n"):
+        # Detect start of new message in mbox format
+        if line.startswith("From ") and (
+            " Mon " in line
+            or " Tue " in line
+            or " Wed " in line
+            or " Thu " in line
+            or " Fri " in line
+            or " Sat " in line
+            or " Sun " in line
+        ):
+            if current_patch:
+                patches.append("\n".join(current_patch))
+            current_patch = [line]
+            in_patch = True
+        elif in_patch:
+            current_patch.append(line)
+
+    # Don't forget the last patch
+    if current_patch:
+        patches.append("\n".join(current_patch))
+
+    return patches if patches else [content]
+
+
+def extract_commit_messages(content: str) -> str:
+    """Extract only commit messages from patch content."""
+    patches = split_mbox_patches(content)
+    messages = []
+
+    for patch in patches:
+        lines = patch.split("\n")
+        msg_lines = []
+        in_headers = True
+        in_body = False
+        found_subject = False
+
+        for line in lines:
+            # Collect headers we care about
+            if in_headers:
+                if line.startswith("Subject:"):
+                    msg_lines.append(line)
+                    found_subject = True
+                elif line.startswith(("From:", "Date:")):
+                    msg_lines.append(line)
+                elif line.startswith((" ", "\t")) and found_subject:
+                    # Subject continuation
+                    msg_lines.append(line)
+                elif line == "":
+                    if found_subject:
+                        in_headers = False
+                        in_body = True
+                        msg_lines.append("")
+            elif in_body:
+                # Stop at the diff
+                if line.startswith("---") and not line.startswith("----"):
+                    break
+                if line.startswith("diff --git"):
+                    break
+                msg_lines.append(line)
+
+        if msg_lines:
+            messages.append("\n".join(msg_lines))
+
+    return "\n\n---\n\n".join(messages)
+
+
+def truncate_content(
+    content: str, max_tokens: float, provider: str
+) -> tuple[str, bool]:
+    """Truncate content to fit within token limit."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN)
+
+    if len(content) <= max_chars:
+        return content, False
+
+    # Try to truncate at a reasonable boundary
+    truncated = content[:max_chars]
+
+    # Find last complete diff hunk or patch boundary
+    last_diff = truncated.rfind("\ndiff --git")
+    last_patch = truncated.rfind("\nFrom ")
+
+    if last_diff > max_chars * 0.5:
+        truncated = truncated[:last_diff]
+    elif last_patch > max_chars * 0.5:
+        truncated = truncated[:last_patch]
+
+    truncated += "\n\n[... Content truncated due to size limits ...]\n"
+    return truncated, True
+
+
+def chunk_content(
+    content: str, max_tokens: int, provider: str
+) -> Iterator[tuple[str, int, int]]:
+    """Split content into chunks that fit within token limit.
+
+    Yields tuples of (chunk_content, chunk_number, total_chunks).
+    """
+    patches = split_mbox_patches(content)
+
+    if len(patches) == 1:
+        # Single large patch - split by diff sections
+        yield from chunk_single_patch(content, max_tokens)
+        return
+
+    # Multiple patches - group them to fit within limits
+    chunks = []
+    current_chunk = []
+    current_size = 0
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)  # 90% to leave margin
+
+    for patch in patches:
+        patch_size = len(patch)
+        if current_size + patch_size > max_chars and current_chunk:
+            chunks.append("\n".join(current_chunk))
+            current_chunk = []
+            current_size = 0
+
+        if patch_size > max_chars:
+            # Single patch too large, truncate it
+            if current_chunk:
+                chunks.append("\n".join(current_chunk))
+                current_chunk = []
+                current_size = 0
+            truncated, _ = truncate_content(patch, max_tokens * 0.9, provider)
+            chunks.append(truncated)
+        else:
+            current_chunk.append(patch)
+            current_size += patch_size
+
+    if current_chunk:
+        chunks.append("\n".join(current_chunk))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def chunk_single_patch(content: str, max_tokens: int) -> Iterator[tuple[str, int, int]]:
+    """Split a single large patch by diff sections."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)
+
+    # Extract header (everything before first diff)
+    first_diff = content.find("\ndiff --git")
+    if first_diff == -1:
+        # No diff sections, just truncate (provider arg is unused here)
+        truncated, _ = truncate_content(content, max_tokens * 0.9, "anthropic")
+        yield truncated, 1, 1
+        return
+
+    header = content[: first_diff + 1]
+    diff_content = content[first_diff + 1 :]
+
+    # Split by diff sections
+    diffs = []
+    current_diff = []
+    for line in diff_content.split("\n"):
+        if line.startswith("diff --git") and current_diff:
+            diffs.append("\n".join(current_diff))
+            current_diff = []
+        current_diff.append(line)
+    if current_diff:
+        diffs.append("\n".join(current_diff))
+
+    # Group diffs into chunks
+    chunks = []
+    current_chunk_diffs = []
+    current_size = len(header)
+
+    for diff in diffs:
+        diff_size = len(diff)
+        if current_size + diff_size > max_chars and current_chunk_diffs:
+            chunks.append(header + "\n".join(current_chunk_diffs))
+            current_chunk_diffs = []
+            current_size = len(header)
+
+        if diff_size + len(header) > max_chars:
+            # Single diff too large
+            if current_chunk_diffs:
+                chunks.append(header + "\n".join(current_chunk_diffs))
+                current_chunk_diffs = []
+            truncated_diff = diff[: max_chars - len(header) - 100]
+            truncated_diff += "\n[... diff truncated ...]\n"
+            chunks.append(header + truncated_diff)
+            current_size = len(header)
+        else:
+            current_chunk_diffs.append(diff)
+            current_size += diff_size
+
+    if current_chunk_diffs:
+        chunks.append(header + "\n".join(current_chunk_diffs))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def get_summary_prompt() -> str:
+    """Get prompt modifications for summary mode."""
+    return """
+NOTE: This is a LARGE patch series. Provide a HIGH-LEVEL summary review only:
+- Focus on overall architecture and design concerns
+- Check commit message formatting across the series
+- Identify any obvious policy violations
+- Do NOT attempt detailed line-by-line code review
+- Summarize the scope and purpose of the changes
+"""
+
+
+def format_combined_reviews(
+    reviews: list[tuple[str, str]], output_format: str, patch_name: str
+) -> str:
+    """Combine multiple chunk/patch reviews into a single output."""
+    if output_format == "json":
+        combined = {
+            "patch_file": patch_name,
+            "sections": [
+                {"label": label, "review": review} for label, review in reviews
+            ],
+        }
+        return json.dumps(combined, indent=2)
+    elif output_format == "html":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"<h2>{label}</h2>\n{review}")
+        return "\n<hr>\n".join(sections)
+    elif output_format == "markdown":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"## {label}\n\n{review}")
+        return "\n\n---\n\n".join(sections)
+    else:  # text
+        sections = []
+        for label, review in reviews:
+            sections.append(f"=== {label} ===\n\n{review}")
+        return ("\n\n" + "=" * 60 + "\n\n").join(sections)
+
+
+def build_system_prompt(review_date: str, release: str | None) -> str:
+    """Build system prompt with date and release context."""
+    prompt = SYSTEM_PROMPT_BASE
+    prompt += f"\n\nCurrent date: {review_date}."
+
+    if release:
+        prompt += f"\nTarget DPDK release: {release}."
+        if is_lts_release(release):
+            prompt += LTS_RULES
+        else:
+            prompt += "\nThis is a main branch or standard release."
+            prompt += "\nNew features and experimental APIs are allowed."
+
+    return prompt
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": system_prompt},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": system_prompt},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": system_prompt}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+    verbose: bool = False,
+) -> str:
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: {usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: {usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def get_last_message_id(patch_content: str) -> str | None:
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-ID:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize: remove < > and add them back
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content: str) -> str | None:
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][8:].strip()
+            i += 1
+            # Handle continuation lines
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+) -> bool:
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try local MTAs: sendmail, then msmtp
+        for mta in ("sendmail", "msmtp"):
+            try:
+                subprocess.run(
+                    [mta, "-t"],
+                    input=email_text,
+                    text=True,
+                    capture_output=True,
+                    check=True,
+                )
+                print(f"Email sent via {mta}", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s -r 24.11 patch.patch           # Review for specific release
+    %(prog)s -r 24.11-lts patch.patch       # Review for LTS branch
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+
+Large File Handling:
+    %(prog)s --split-patches series.mbox    # Review each patch separately
+    %(prog)s --split-patches --patch-range 1-5 series.mbox  # Review patches 1-5
+    %(prog)s --large-file=truncate patch.mbox   # Truncate to fit limit
+    %(prog)s --large-file=commits-only series.mbox  # Review commit messages only
+    %(prog)s --large-file=summary series.mbox   # High-level summary only
+    %(prog)s --large-file=chunk series.mbox     # Split and review in chunks
+
+Large File Modes:
+    error       - Fail with error (default)
+    truncate    - Truncate content to fit token limit
+    chunk       - Split into chunks and review each
+    commits-only - Extract and review only commit messages
+    summary     - Request high-level summary review
+
+LTS Releases:
+    Use -r/--release with LTS version (e.g., 24.11-lts, 23.11) to enable
+    stricter review rules: bug fixes only, no new features or APIs.
+    Any DPDK release with minor version .11 is an LTS release.
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Date and release options
+    parser.add_argument(
+        "-D",
+        "--date",
+        metavar="YYYY-MM-DD",
+        help="Review date context (default: today)",
+    )
+    parser.add_argument(
+        "-r",
+        "--release",
+        metavar="VERSION",
+        help="Target DPDK release (e.g., 24.11, 23.11-lts)",
+    )
+
+    # Large file handling options
+    large_group = parser.add_argument_group("Large File Handling")
+    large_group.add_argument(
+        "--large-file",
+        choices=LARGE_FILE_MODES,
+        default="error",
+        metavar="MODE",
+        help="How to handle large files: error (default), truncate, "
+        "chunk, commits-only, summary",
+    )
+    large_group.add_argument(
+        "--max-tokens",
+        type=int,
+        metavar="N",
+        help="Max input tokens (default: provider-specific)",
+    )
+    large_group.add_argument(
+        "--split-patches",
+        action="store_true",
+        help="Split mbox into individual patches and review each separately",
+    )
+    large_group.add_argument(
+        "--patch-range",
+        metavar="N-M",
+        help="Review only patches N through M (1-indexed, use with --split-patches)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine review date
+    review_date = args.date or date.today().isoformat()
+
+    # Build system prompt with date and release context
+    system_prompt = build_system_prompt(review_date, args.release)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    # Determine max tokens for this provider
+    max_input_tokens = args.max_tokens or PROVIDER_INPUT_LIMITS.get(
+        args.provider, 100000
+    )
+
+    # Estimate token count
+    estimated_tokens = estimate_tokens(patch_content + agents_content)
+
+    # Parse patch range if specified
+    patch_start, patch_end = None, None
+    if args.patch_range:
+        try:
+            if "-" in args.patch_range:
+                start, end = args.patch_range.split("-", 1)
+                patch_start = int(start)
+                patch_end = int(end)
+            else:
+                patch_start = patch_end = int(args.patch_range)
+        except ValueError:
+            error(f"Invalid --patch-range format: {args.patch_range}")
+
+    # Handle --split-patches mode
+    if args.split_patches:
+        patches = split_mbox_patches(patch_content)
+        total_patches = len(patches)
+
+        if total_patches == 1:
+            print(
+                "Note: Only 1 patch found in mbox, --split-patches has no effect",
+                file=sys.stderr,
+            )
+        else:
+            print(
+                f"Found {total_patches} patches in mbox",
+                file=sys.stderr,
+            )
+
+            # Apply patch range filter
+            if patch_start is not None:
+                if patch_start < 1 or patch_start > total_patches:
+                    error(
+                        f"Patch range start {patch_start} out of range (1-{total_patches})"
+                    )
+                if patch_end < patch_start or patch_end > total_patches:
+                    error(
+                        f"Patch range end {patch_end} out of range ({patch_start}-{total_patches})"
+                    )
+                patches = patches[patch_start - 1 : patch_end]
+                print(
+                    f"Reviewing patches {patch_start}-{patch_end} ({len(patches)} patches)",
+                    file=sys.stderr,
+                )
+
+            # Review each patch separately
+            all_reviews = []
+            for i, patch in enumerate(patches, patch_start or 1):
+                patch_label = f"Patch {i}/{total_patches}"
+                print(f"\nReviewing {patch_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    patch,
+                    f"{patch_name} ({patch_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((patch_label, review_text))
+
+            # Combine reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal API call
+            estimated_tokens = 0  # Bypass size check since we've already processed
+
+    # Check if content is too large
+    is_large = estimated_tokens > max_input_tokens
+
+    if is_large:
+        print(
+            f"Warning: Estimated {estimated_tokens:,} tokens exceeds limit of "
+            f"{max_input_tokens:,}",
+            file=sys.stderr,
+        )
+
+        if args.large_file == "error":
+            error(
+                f"Patch file too large ({estimated_tokens:,} tokens). "
+                f"Use --large-file=truncate|chunk|commits-only|summary to handle, "
+                f"or --split-patches to review patches individually."
+            )
+        elif args.large_file == "truncate":
+            print("Truncating content to fit token limit...", file=sys.stderr)
+            # Reserve part of the budget for the AGENTS.md content,
+            # which is sent alongside the patch
+            patch_budget = max_input_tokens - estimate_tokens(agents_content)
+            patch_content, was_truncated = truncate_content(
+                patch_content, patch_budget, args.provider
+            )
+            if was_truncated:
+                print("Content was truncated.", file=sys.stderr)
+        elif args.large_file == "commits-only":
+            print("Extracting commit messages only...", file=sys.stderr)
+            patch_content = extract_commit_messages(patch_content)
+            new_estimate = estimate_tokens(patch_content + agents_content)
+            print(
+                f"Reduced to ~{new_estimate:,} tokens (commit messages only)",
+                file=sys.stderr,
+            )
+            if new_estimate > max_input_tokens:
+                patch_content, _ = truncate_content(
+                    patch_content, max_input_tokens, args.provider
+                )
+        elif args.large_file == "summary":
+            print("Using summary mode for large patch...", file=sys.stderr)
+            system_prompt += get_summary_prompt()
+            patch_content, _ = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+        elif args.large_file == "chunk":
+            print("Processing in chunks...", file=sys.stderr)
+            all_reviews = []
+            for chunk, chunk_num, total_chunks in chunk_content(
+                patch_content, max_input_tokens, args.provider
+            ):
+                chunk_label = f"Chunk {chunk_num}/{total_chunks}"
+                print(f"Reviewing {chunk_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    chunk,
+                    f"{patch_name} ({chunk_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((chunk_label, review_text))
+
+            # Combine chunk reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal single API call below
+            estimated_tokens = 0
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Review date: {review_date}", file=sys.stderr)
+        if args.release:
+            lts_status = " (LTS)" if is_lts_release(args.release) else ""
+            print(f"Target release: {args.release}{lts_status}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        print(f"Estimated tokens: {estimated_tokens:,}", file=sys.stderr)
+        print(f"Max input tokens: {max_input_tokens:,}", file=sys.stderr)
+        if args.large_file != "error":
+            print(f"Large file mode: {args.large_file}", file=sys.stderr)
+        if args.split_patches:
+            print("Split patches: yes", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API (unless already processed via chunks/split)
+    if estimated_tokens > 0:  # Not already processed
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            args.output_format,
+            args.verbose,
+        )
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "review_date": review_date,
+                "target_release": args.release,
+                "is_lts": is_lts_release(args.release) if args.release else False,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"<br>Target release: {args.release}{lts_badge}"
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) on {review_date} -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model}) on {review_date}{release_info}</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"\n*Target release: {args.release}{lts_badge}*\n"
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model}) on {review_date}*
+{release_info}
+{review_text}
+"""
+    else:  # text
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n"
+        output_text += f"Review date: {review_date}\n"
+        output_text += release_info
+        output_text += "\n" + review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model}) on {review_date}
+{release_info}
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0



* [PATCH v9 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-03-04 17:59     ` Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.
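
The auto-detection described above amounts to filtering a provider-to-environment-variable
table; a minimal standalone sketch (provider names and variable names as listed in the
script's help text, example environment is fabricated):

```python
import os

# Provider -> API key environment variable, as documented in the usage text
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}

def available_providers(env=os.environ):
    """Return providers whose API key variable is set and non-empty."""
    return [p for p, var in PROVIDER_KEYS.items() if env.get(var)]

# Example with a fake environment: empty values are treated as unset
print(available_providers({"ANTHROPIC_API_KEY": "sk-test", "XAI_API_KEY": ""}))
# ['anthropic']
```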

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.txt when using the
output directory option.
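The naming scheme can be reproduced with a few lines of path handling; the patch
name and provider list here are hypothetical:

```python
from pathlib import Path

def review_filename(patch_file, provider, ext="txt"):
    """Build <patch>-<provider>.<ext> from a patch path, dropping the directory."""
    return f"{Path(patch_file).stem}-{provider}.{ext}"

for provider in ("anthropic", "xai"):
    print(review_filename("reviews/fix-mbuf.patch", provider))
# fix-mbuf-anthropic.txt
# fix-mbuf-xai.txt
```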

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: "\
+"ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.51.0



* [PATCH v9 4/6] devtools: add multi-provider AI documentation review script
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (2 preceding siblings ...)
  2026-03-04 17:59     ` [PATCH v9 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-03-04 17:59     ` Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models. Supports batch processing of multiple files.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

Output formats (-f/--format):
  - text: plain text with extractable diff/msg markers (default)
  - markdown: formatted review document
  - html: complete HTML document with styling
  - json: structured data with metadata

For each input file, the script produces:
  - <basename>.{txt,md,html,json}: review in selected format
  - <basename>.diff: unified diff (text/json formats, or any format with -d)
  - <basename>.msg: commit message (text/json formats, or any format with -d)

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide).
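The prefix lookup is a first-match scan over path prefixes, most specific first;
a trimmed-down sketch with only a few illustrative entries from the full table:

```python
# Most specific path prefixes first; "doc/guides/" acts as the catch-all
PREFIX_MAP = [
    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
    ("doc/guides/nics/", "doc/guides/nics:"),
    ("doc/guides/", "doc:"),
]

def commit_prefix(path):
    """Return the commit subject prefix for a documentation path."""
    for prefix_path, prefix in PREFIX_MAP:
        if path.startswith(prefix_path):
            return prefix
    return "doc:"

print(commit_prefix("doc/guides/prog_guide/mempool_lib.rst"))
# doc/guides/prog_guide:
```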

Features:
  - Multiple file processing with glob support
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output-dir option
  - Output format selection via -f/--format option
  - Force diff/msg generation via -d/--diff option
  - Quiet mode (-q) suppresses stdout output
  - Verbose mode (-v) shows token usage and API details
  - Email integration using git sendemail configuration
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py doc/guides/nics/*.rst
  ./devtools/review-doc.py -f html -d -o /tmp doc/guides/nics/*.rst
  ./devtools/review-doc.py --send-email --to dev@dpdk.org file.rst

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 1099 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1099 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..c8a1988a10
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,1099 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Accepts multiple documentation files and generates output for each.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Map output format to file extension
+FORMAT_EXTENSIONS = {
+    "text": ".txt",
+    "markdown": ".md",
+    "html": ".html",
+    "json": ".json",
+}
+
+# Additional markers for extracting diff/msg (used with --diff flag)
+DIFF_MARKERS_INSTRUCTION = """
+
+ADDITIONALLY, at the end of your response, include these exact markers for automated extraction:
+---COMMIT_MESSAGE_START---
+(same commit message as above)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(same unified diff as above)
+---UNIFIED_DIFF_END---
+"""
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds via git commit --sign)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config() -> dict[str, Any]:
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    if config["port"]:
+        config["port"] = int(config["port"])
+
+    return config
+
+
+def get_commit_prefix(filepath: str) -> str:
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+    verbose: bool = False,
+) -> str:
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: " f"{usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: " f"{usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def parse_review_text(review_text: str) -> tuple[str, str]:
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def strip_diff_markers(text: str) -> str:
+    """Remove the diff/msg extraction markers from text."""
+    # Remove commit message markers and content
+    text = re.sub(
+        r"\n*---COMMIT_MESSAGE_START---\s*\n.*?\n---COMMIT_MESSAGE_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    # Remove unified diff markers and content
+    text = re.sub(
+        r"\n*---UNIFIED_DIFF_START---\s*\n.*?\n---UNIFIED_DIFF_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    return text.strip()
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+    verbose: bool = False,
+) -> bool:
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+        return True
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers. "
+        "Accepts multiple files and generates output for each.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s doc/guides/nics/*.rst              # Review all NIC docs
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst doc/guides/nics/i40e.rst
+    %(prog)s -f html -d -o /tmp/reviews doc/guides/nics/*.rst  # HTML + diff files
+    %(prog)s -f json -o /tmp doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+Output files (in output-dir):
+    <basename>.txt|.md|.html|.json   Review in selected format
+    <basename>.diff                  Unified diff (text/json, or with --diff)
+    <basename>.msg                   Commit message (text/json, or with --diff)
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+        """,
+    )
+
+    parser.add_argument(
+        "doc_files",
+        nargs="+",
+        metavar="doc_file",
+        help="Documentation file(s) to review",
+    )
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for all output files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress review output to stdout (only write files)",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-d",
+        "--diff",
+        action="store_true",
+        help="Always produce .diff and .msg files (automatic for text/json)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    # Validate all doc files exist before processing
+    doc_paths = []
+    for doc_file in args.doc_files:
+        doc_path = Path(doc_file)
+        if not doc_path.exists():
+            error(f"Documentation file not found: {doc_file}")
+        doc_paths.append((doc_file, doc_path))
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read AGENTS.md once
+    agents_content = agents_path.read_text()
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    provider_name = config["name"]
+
+    # Process each file
+    num_files = len(doc_paths)
+    for file_idx, (doc_file, doc_path) in enumerate(doc_paths, 1):
+        if num_files > 1:
+            print(
+                f"\n{'=' * 60}",
+                file=sys.stderr,
+            )
+            print(
+                f"Processing file {file_idx}/{num_files}: {doc_file}",
+                file=sys.stderr,
+            )
+            print(
+                f"{'=' * 60}",
+                file=sys.stderr,
+            )
+
+        # Determine output filenames
+        doc_basename = doc_path.stem
+        diff_file = output_dir / f"{doc_basename}.diff"
+        msg_file = output_dir / f"{doc_basename}.msg"
+
+        # Get commit prefix
+        commit_prefix = get_commit_prefix(doc_file)
+
+        # Read doc content
+        doc_content = doc_path.read_text()
+
+        if args.verbose:
+            print("=== Request ===", file=sys.stderr)
+            print(f"Provider: {args.provider}", file=sys.stderr)
+            print(f"Model: {model}", file=sys.stderr)
+            print(f"Output format: {args.output_format}", file=sys.stderr)
+            print(f"AGENTS file: {args.agents}", file=sys.stderr)
+            print(f"Doc file: {doc_file}", file=sys.stderr)
+            print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+            print(f"Output dir: {args.output_dir}", file=sys.stderr)
+            if args.send_email:
+                print("Send email: yes", file=sys.stderr)
+                print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+                if args.cc_addrs:
+                    print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+                print(f"From: {from_addr}", file=sys.stderr)
+            print("===============", file=sys.stderr)
+
+        # Call API
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            args.output_format,
+            args.diff,
+            args.verbose,
+        )
+
+        if not review_text:
+            print(
+                f"Warning: No response received for {doc_file}",
+                file=sys.stderr,
+            )
+            continue
+
+        # Determine review output file
+        format_ext = FORMAT_EXTENSIONS[args.output_format]
+        review_file = output_dir / f"{doc_basename}{format_ext}"
+
+        # Determine if we should write diff/msg files
+        write_diff_msg = args.diff or args.output_format in ("text", "json")
+
+        # Extract commit message and diff first (before stripping markers)
+        commit_msg, diff = "", ""
+        if write_diff_msg:
+            if args.output_format == "json":
+                # Will extract from JSON below
+                pass
+            else:
+                # Parse from text format markers
+                commit_msg, diff = parse_review_text(review_text)
+
+        # For non-text formats with --diff, strip the markers from display output
+        display_text = review_text
+        if args.diff and args.output_format in ("markdown", "html"):
+            display_text = strip_diff_markers(review_text)
+
+        # Build formatted output text
+        if args.output_format == "text":
+            output_text = review_text
+        elif args.output_format == "json":
+            # Try to parse JSON response
+            try:
+                review_data = json.loads(review_text)
+            except json.JSONDecodeError:
+                print("Warning: Response is not valid JSON", file=sys.stderr)
+                review_data = {"raw_response": review_text}
+
+            # Extract diff/msg from JSON if present
+            if write_diff_msg:
+                if isinstance(review_data, dict) and "raw_response" not in review_data:
+                    commit_msg = review_data.get("commit_message", "")
+                    diff = review_data.get("diff", "")
+
+            # Add metadata
+            output_data = {
+                "metadata": {
+                    "doc_file": doc_file,
+                    "provider": args.provider,
+                    "provider_name": provider_name,
+                    "model": model,
+                    "commit_prefix": commit_prefix,
+                },
+                "review": review_data,
+            }
+            output_text = json.dumps(output_data, indent=2)
+        elif args.output_format == "markdown":
+            output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{display_text}
+"""
+        elif args.output_format == "html":
+            output_text = f"""<!DOCTYPE html>
+<html>
+<head>
+<meta charset="utf-8">
+<title>Review: {doc_path.name}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; max-width: 900px; margin: 2em auto; padding: 0 1em; }}
+h1 {{ color: #333; }}
+.review-meta {{ color: #666; font-style: italic; }}
+pre {{ background: #f5f5f5; padding: 1em; overflow-x: auto; }}
+</style>
+</head>
+<body>
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+<div class="review-content">
+{display_text}
+</div>
+</body>
+</html>
+"""
+
+        # Write formatted review to file
+        review_file.write_text(output_text)
+        print(f"Review written to: {review_file}", file=sys.stderr)
+
+        # Write diff/msg files
+        if write_diff_msg:
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+            else:
+                msg_file.write_text("# No commit message generated\n")
+                print("Warning: Could not extract commit message", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+            else:
+                diff_file.write_text("# No changes suggested\n")
+                print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print to stdout unless quiet (or multiple files without verbose)
+        show_stdout = not args.quiet and (num_files == 1 or args.verbose)
+        if show_stdout:
+            print(
+                f"\n=== Documentation Review: {doc_path.name} "
+                f"(via {provider_name}) ==="
+            )
+            print(output_text)
+
+            # Print usage instructions for text format
+            if args.output_format == "text":
+                print("\n=== Output Files ===")
+                print(f"Commit message: {msg_file}")
+                print(f"Diff file:      {diff_file}")
+                print("\nTo apply changes:")
+                print(f"  git apply {diff_file}")
+                print(f"  git commit -sF {msg_file}")
+
+        # Send email if requested
+        if args.send_email:
+            if args.output_format != "text":
+                print(
+                    f"Note: Email will be sent as plain text regardless of "
+                    f"--format={args.output_format}",
+                    file=sys.stderr,
+                )
+
+            review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+            # Build email body
+            email_body = f"""AI-generated documentation review of {doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+            if args.verbose:
+                print("", file=sys.stderr)
+                print("=== Email Details ===", file=sys.stderr)
+                print(f"Subject: {review_subject}", file=sys.stderr)
+                print("=====================", file=sys.stderr)
+
+            send_email(
+                args.to_addrs,
+                args.cc_addrs,
+                from_addr,
+                review_subject,
+                None,
+                email_body,
+                args.dry_run,
+                args.verbose,
+            )
+
+            if not args.dry_run:
+                print("", file=sys.stderr)
+                print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+    # Print summary for multiple files
+    if num_files > 1:
+        print(f"\n{'=' * 60}", file=sys.stderr)
+        print(f"Processed {num_files} files", file=sys.stderr)
+        print(f"Output directory: {output_dir}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 5/6] doc: add AI-assisted patch review to contributing guide
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (3 preceding siblings ...)
  2026-03-04 17:59     ` [PATCH v9 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-03-04 17:59     ` Stephen Hemminger
  2026-03-04 17:59     ` [PATCH v9 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a new section to the contributing guide describing the
analyze-patch.py script which uses AI providers to review patches
against DPDK coding standards before submission to the mailing list.

The new section covers basic usage, provider selection, patch series
handling, LTS release review, and output format options. A note
clarifies that AI review supplements but does not replace human
review.

Also add a reference to the script in the new driver guide's
test tools checklist.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/contributing/new_driver.rst |  2 +
 doc/guides/contributing/patches.rst    | 59 ++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/guides/contributing/new_driver.rst b/doc/guides/contributing/new_driver.rst
index 555e875329..6c0d356cfd 100644
--- a/doc/guides/contributing/new_driver.rst
+++ b/doc/guides/contributing/new_driver.rst
@@ -210,3 +210,5 @@ Be sure to run the following test tools per patch in a patch series:
 * `check-doc-vs-code.sh`
 * `check-spdx-tag.sh`
 * Build documentation and validate how output looks
+* Optionally run ``analyze-patch.py`` for AI-assisted review
+  (see :ref:`ai_assisted_review` in the Contributing Guide)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 5f554d47e6..1e50799c19 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -183,6 +183,10 @@ Make your planned changes in the cloned ``dpdk`` repo. Here are some guidelines
 
 * Code and related documentation must be updated atomically in the same patch.
 
+* Consider running the :ref:`AI-assisted review <ai_assisted_review>` tool
+  before submitting to catch common issues early.
+  This is encouraged but not required.
+
 Once the changes have been made you should commit them to your local repo.
 
 For small changes, that do not require specific explanations, it is better to keep things together in the
@@ -503,6 +507,61 @@ Additionally, when contributing to the DTS tool, patches should also be checked
 the ``dts-check-format.sh`` script in the ``devtools`` directory of the DPDK repo.
 To run the script, extra :ref:`Python dependencies <dts_deps>` are needed.
 
+
+.. _ai_assisted_review:
+
+AI-Assisted Patch Review
+------------------------
+
+Contributors may optionally use the ``devtools/analyze-patch.py`` script
+to get an AI-assisted review of patches before submitting them to the mailing list.
+The script checks patches against the DPDK coding standards and contribution
+guidelines documented in ``AGENTS.md``.
+
+The script supports multiple AI providers (Anthropic Claude, OpenAI ChatGPT,
+xAI Grok, Google Gemini).  An API key for the chosen provider must be set
+in the corresponding environment variable (see ``--list-providers``).
+
+Basic usage::
+
+   # Review a single patch (default provider: Anthropic Claude)
+   devtools/analyze-patch.py my-patch.patch
+
+   # Use a different provider
+   devtools/analyze-patch.py -p openai my-patch.patch
+
+   # Review for an LTS branch (enables stricter rules)
+   devtools/analyze-patch.py -r 24.11 my-patch.patch
+
+   # List available providers and their API key variables
+   devtools/analyze-patch.py --list-providers
+
+For a patch series in an mbox file, the ``--split-patches`` option reviews
+each patch individually::
+
+   devtools/analyze-patch.py --split-patches series.mbox
+
+   # Review only a range of patches
+   devtools/analyze-patch.py --split-patches --patch-range 1-5 series.mbox
+
+When reviewing for a Long Term Stable (LTS) release, use the ``-r`` option
+with the target version.  Any DPDK release with minor version ``.11``
+(e.g., 23.11, 24.11) is automatically recognized as LTS,
+and the script will enforce stricter rules: bug fixes only, no new features or APIs.
+
+Output can be formatted as plain text (default), Markdown, HTML, or JSON::
+
+   devtools/analyze-patch.py -f markdown -o review.md my-patch.patch
+
+The review guidelines in ``AGENTS.md`` focus on correctness bug detection
+and other DPDK-specific requirements. Commit message formatting and
+SPDX/copyright compliance are checked by ``checkpatches.sh`` and are
+not duplicated in the AI review.
+
+.. note::
+
+   Always verify AI suggestions before acting on them.
+
 .. _contrib_check_compilation:
 
 Checking Compilation
-- 
2.51.0



* [PATCH v9 6/6] MAINTAINERS: add section for AI review tools
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (4 preceding siblings ...)
  2026-03-04 17:59     ` [PATCH v9 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
@ 2026-03-04 17:59     ` Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-04 17:59 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Thomas Monjalon

Add maintainer entries for the AI-assisted code review tooling:
AGENTS.md, analyze-patch.py, compare-reviews.sh, and
review-doc.py.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5eb8e9dc22..7c1a84274f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -109,6 +109,14 @@ F: license/
 F: .editorconfig
 F: .mailmap
 
+AI review tools
+M: Stephen Hemminger <stephen@networkplumber.org>
+M: Aaron Conole <aconole@redhat.com>
+F: AGENTS.md
+F: devtools/analyze-patch.py
+F: devtools/compare-reviews.sh
+F: devtools/review-doc.py
+
 Linux kernel uAPI headers
 M: Maxime Coquelin <maxime.coquelin@redhat.com>
 F: devtools/linux-uapi.sh
-- 
2.51.0



* [PATCH v10 0/6] Add AGENTS and scripts for AI code review
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (5 preceding siblings ...)
  2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-03-10  1:57   ` Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
                       ` (5 more replies)
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                     ` (2 subsequent siblings)
  9 siblings, 6 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add guidelines and tooling for AI-assisted code review of DPDK
patches.

AGENTS.md provides a two-tier review framework: correctness bugs
(resource leaks, use-after-free, race conditions) are reported at
>=50% confidence; style issues require >80% with false positive
suppression. Mechanical checks handled by checkpatches.sh are
excluded to avoid redundant findings.

The analyze-patch.py script supports multiple AI providers
(Anthropic, OpenAI, xAI, Google) with mbox splitting, prompt
caching, and direct SMTP sending.

v10 - add more checks about MTU, buffer size, and scatter,
      based on Ferruh's revision in 2024.

v9 - update AGENTS to reduce false positives
   - remove commit message/SPDX items from prompt (checkpatch's job).
   - update contributing guide text to match actual AGENTS.md coverage.

Stephen Hemminger (6):
  doc: add AGENTS.md for AI code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script
  doc: add AI-assisted patch review to contributing guide
  MAINTAINERS: add section for AI review tools

 AGENTS.md                              | 2115 ++++++++++++++++++++++++
 MAINTAINERS                            |    8 +
 devtools/analyze-patch.py              | 1348 +++++++++++++++
 devtools/compare-reviews.sh            |  192 +++
 devtools/review-doc.py                 | 1099 ++++++++++++
 doc/guides/contributing/new_driver.rst |    2 +
 doc/guides/contributing/patches.rst    |   59 +
 7 files changed, 4823 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100755 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.51.0



* [PATCH v10 1/6] doc: add AGENTS.md for AI code review tools
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
@ 2026-03-10  1:57     ` Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Provide structured guidelines for AI tools reviewing DPDK
patches. Focuses on correctness bug detection (resource leaks,
use-after-free, race conditions), C coding style, forbidden
tokens, API conventions, and severity classifications.

Mechanical checks already handled by checkpatches.sh (SPDX
format, commit message formatting, tag ordering) are excluded
to avoid redundant and potentially contradictory findings.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 2115 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2115 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..70c057424e
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,2115 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+## CRITICAL INSTRUCTION - READ FIRST
+
+This document has two categories of review rules with different
+confidence thresholds:
+
+### 1. Correctness Bugs -- HIGHEST PRIORITY (report at >=50% confidence)
+
+**Always report potential correctness bugs.** These are the most
+valuable findings. When in doubt, report them with a note about
+your confidence level. A possible use-after-free or resource leak
+is worth mentioning even if you are not certain.
+
+Correctness bugs include:
+- Use-after-free (accessing memory after `free`/`rte_free`)
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference
+- Buffer overflows or out-of-bounds access
+- Uninitialized variable use in a reachable code path
+- Race conditions (unsynchronized shared state)
+- `volatile` used instead of atomic operations for inter-thread shared variables
+- `__atomic_load_n()`/`__atomic_store_n()`/`__atomic_*()` GCC built-ins instead of `rte_atomic_*_explicit()`
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` legacy barriers instead of `rte_atomic_thread_fence()`
+- Missing error checks on functions that can fail
+- Error paths that skip cleanup (goto labels, missing free/close)
+- Incorrect error propagation (wrong return value, lost errno)
+- Logic errors in conditionals (wrong operator, inverted test)
+- Integer overflow/truncation in size calculations
+- Missing bounds checks on user-supplied sizes or indices
+- `mmap()` return checked against `NULL` instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=`
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied data
+- `1 << n` on 64-bit bitmask (must use `1ULL << n` or `RTE_BIT64()`)
+- Variable assigned then overwritten before being read (dead store)
+- Same variable used as loop counter in nested loops
+- `memcpy`/`memcmp`/`memset` with same pointer for source and destination (no-op or undefined)
+- `rte_mbuf_raw_free_bulk()` called on mbufs that may originate from different mempools (Tx burst, ring dequeue)
+- MTU confused with frame length (MTU is L3 payload; frame length = MTU + L2 overhead)
+- Using `dev_conf.rxmode.mtu` after configure instead of `dev->data->mtu`
+- Hardcoded Ethernet overhead instead of per-device calculation
+- MTU set without enabling `RTE_ETH_RX_OFFLOAD_SCATTER` when frame size exceeds mbuf data room
+- `mtu_set` callback rejects valid MTU when scatter Rx is already enabled
+- Rx queue setup silently drops oversized packets instead of enabling scatter or returning an error
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size check
+
+**Do NOT self-censor correctness bugs.** If you identify a code
+path where a resource could leak or memory could be used after
+free, report it. Do not talk yourself out of it.
+
+### 2. Style, Process, and Formatting -- suppress false positives
+
+**NEVER list a style/process item under "Errors" or "Warnings" if
+you conclude it is correct.**
+
+Before outputting any style, formatting, or process error/warning,
+verify it is actually wrong. If your analysis concludes with
+phrases like "there's no issue here", "which is fine", "appears
+correct", "is acceptable", or "this is actually correct" -- then
+DO NOT INCLUDE IT IN YOUR OUTPUT AT ALL. Delete it. Omit it
+entirely.
+
+This suppression rule applies to: naming conventions,
+code style, and process compliance. It does NOT apply to
+correctness bugs listed above. (SPDX/copyright format and
+commit message formatting are handled by checkpatch and are
+excluded from AI review entirely.)
+
+---
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux kernel's. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+
+**Correctness bugs are the primary goal of AI review.** Style and
+formatting checks are secondary. A review that catches a
+use-after-free but misses a style nit is far more valuable than
+one that catches every style issue but misses the bug.
+
+**BEFORE OUTPUTTING YOUR REVIEW**: Re-read each item.
+- For correctness bugs: keep them. If you have reasonable doubt
+  that a code path is safe, report it.
+- For style/process items: if ANY item contains phrases like "is
+  fine", "no issue", "appears correct", "is acceptable",
+  "actually correct" -- DELETE THAT ITEM. Do not include it.
+
+### Correctness review guidelines
+- Trace error paths: for every function that allocates a resource
+  or acquires a lock, verify that ALL error paths after that point
+  release it
+- Check every `goto error` and early `return`: does it clean up
+  everything allocated so far?
+- Look for use-after-free: after `free(p)`, is `p` accessed again?
+- Check that error codes are propagated, not silently dropped
+- Report at >=50% confidence; note uncertainty if appropriate
+- It is better to report a potential bug that turns out to be safe
+  than to miss a real bug
+
+### Style and process review guidelines
+- Only comment on style/process issues when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+- Do NOT comment on copyright years, SPDX format, or copyright holders - not subject to AI review
+- Do NOT report an issue then contradict yourself - if something is acceptable, do not mention it at all
+- Do NOT include items in Errors/Warnings that you then say are "acceptable" or "correct"
+- Do NOT mention things that are correct or "not an issue" - only report actual problems
+- Do NOT speculate about contributor circumstances (employment, company policies, etc.)
+- Before adding any style item to your review, ask: "Is this actually wrong?" If no, omit it entirely.
+- NEVER write "(Correction: ...)" - if you need to correct yourself, simply omit the item entirely
+- Do NOT add vague suggestions like "should be verified" or "should be checked" - either it's wrong or don't mention it
+- Do NOT flag something as an Error then say "which is correct" in the same item
+- Do NOT say "no issue here" or "this is actually correct" - if there's no issue, do not include it in your review
+- Do NOT analyze cross-patch dependencies or compilation order - you cannot reliably determine this from patch review
+- Do NOT claim a patch "would cause compilation failure" based on symbols used in other patches in the series
+- Review each patch individually for its own correctness; assume the patch author ordered them correctly
+- When reviewing a patch series, OMIT patches that have no issues. Do not include a patch in your output just to say "no issues found" or to summarize what the patch does. Only include patches where you have actual findings to report.
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard-coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- **`volatile` for inter-thread synchronization**: `volatile` does not
+  provide atomicity or memory ordering between threads. Use
+  `rte_atomic_load_explicit()`/`rte_atomic_store_explicit()` with
+  appropriate `rte_memory_order_*` instead. See the Shared Variable
+  Access section under Forbidden Tokens for details.
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- **Use-after-free** (any access to memory after it has been freed)
+- **Error path resource leaks**: For every allocation or fd open,
+  trace each error path (`goto`, early `return`, conditional) to
+  verify the resource is released. Common patterns to check:
+  - `malloc`/`rte_malloc` followed by a failure that does `return -1`
+    instead of `goto cleanup`
+  - `open()`/`socket()` fd not closed on a later error
+  - Lock acquired but not released on an error branch
+  - Partially initialized structure where early fields are allocated
+    but later allocation fails without freeing the early ones
+- **Double-free / double-close**: resource freed in both a normal
+  path and an error path, or fd closed but not set to -1 allowing
+  a second close
+- **Missing error checks**: functions that can fail (malloc, open,
+  ioctl, etc.) whose return value is not checked
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Usage of deprecated APIs when replacements exist
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already shows (remove them)
+- **Process-shared synchronization errors** (pthread mutexes in shared memory without `PTHREAD_PROCESS_SHARED`)
+- **`mmap()` checked against NULL instead of `MAP_FAILED`**: `mmap()` returns
+  `MAP_FAILED` (i.e., `(void *)-1`) on failure, NOT `NULL`. Checking
+  `== NULL` or `!= NULL` will miss the error and use an invalid pointer.
+  ```c
+  /* BAD - mmap never returns NULL on failure */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == NULL)       /* WRONG - will not catch MAP_FAILED */
+      return -1;
+
+  /* GOOD */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == MAP_FAILED)
+      return -1;
+  ```
+- **Statistics accumulation using `=` instead of `+=`**: When accumulating
+  statistics (counters, byte totals, packet counts), using `=` overwrites
+  the running total with only the latest value. This silently produces
+  wrong results.
+  ```c
+  /* BAD - overwrites instead of accumulating */
+  stats->rx_packets = nb_rx;
+  stats->rx_bytes = total_bytes;
+
+  /* GOOD - accumulates over time */
+  stats->rx_packets += nb_rx;
+  stats->rx_bytes += total_bytes;
+  ```
+  Note: `=` is correct for gauge-type values (e.g., queue depth, link
+  status) and for initial assignment. Only flag when the context is
+  clearly incremental accumulation (loop bodies, per-burst counters,
+  callback tallies).
+- **Integer multiply without widening cast**: When multiplying integers
+  to produce a result wider than the operands (sizes, offsets, byte
+  counts), the multiplication is performed at the operand width and
+  the upper bits are silently lost before the assignment. This applies
+  to any narrowing scenario: 16×16 assigned to a 32-bit variable,
+  32×32 assigned to a 64-bit variable, etc.
+  ```c
+  /* BAD - 32×32 overflows before widening to 64 */
+  uint64_t total_size = num_entries * entry_size;  /* both are uint32_t */
+  size_t offset = ring->idx * ring->desc_size;     /* 32×32 → truncated */
+
+  /* BAD - 16×16 overflows before widening to 32 */
+  uint32_t byte_count = pkt_len * nb_segs;         /* both are uint16_t */
+
+  /* GOOD - widen before multiply */
+  uint64_t total_size = (uint64_t)num_entries * entry_size;
+  size_t offset = (size_t)ring->idx * ring->desc_size;
+  uint32_t byte_count = (uint32_t)pkt_len * nb_segs;
+  ```
+- **Unbounded descriptor chain traversal**: When walking a chain of
+  descriptors (virtio, DMA, NIC Rx/Tx rings) where the chain length
+  or next-index comes from guest memory or an untrusted API caller,
+  the traversal MUST have a bounds check or loop counter to prevent
+  infinite loops or out-of-bounds access from malicious/corrupt data.
+  ```c
+  /* BAD - guest controls desc[idx].next with no bound */
+  while (desc[idx].flags & VRING_DESC_F_NEXT) {
+      idx = desc[idx].next;          /* guest-supplied, unbounded */
+      process(desc[idx]);
+  }
+
+  /* GOOD - cap iterations to descriptor ring size */
+  for (i = 0; i < ring_size; i++) {
+      if (!(desc[idx].flags & VRING_DESC_F_NEXT))
+          break;
+      idx = desc[idx].next;
+      if (idx >= ring_size)          /* bounds check */
+          return -EINVAL;
+      process(desc[idx]);
+  }
+  ```
+  This applies to any chain/linked-list traversal where indices or
+  pointers originate from untrusted input (guest VMs, user-space
+  callers, network packets).
+- **Bitmask shift using `1` instead of `1ULL` on 64-bit masks**: The
+  literal `1` is `int` (32 bits). Shifting it by 32 or more is
+  undefined behavior; `1 << 31` overflows the signed `int` and
+  sign-extends when converted to `uint64_t`; and no shift of a 32-bit
+  `1` can ever correctly set any of the upper 32 bits of a 64-bit
+  mask. Use `1ULL << n`, `UINT64_C(1) << n`, or the DPDK
+  `RTE_BIT64(n)` macro.
+  ```c
+  /* BAD - 1 is int, UB if n >= 32, wrong if result used as uint64_t */
+  uint64_t mask = 1 << bit_pos;
+  if (features & (1 << VIRTIO_NET_F_MRG_RXBUF))  /* bit 15 OK, bit 32+ UB */
+
+  /* GOOD */
+  uint64_t mask = UINT64_C(1) << bit_pos;
+  uint64_t mask = 1ULL << bit_pos;
+  uint64_t mask = RTE_BIT64(bit_pos);        /* preferred in DPDK */
+  if (features & RTE_BIT64(VIRTIO_NET_F_MRG_RXBUF))
+  ```
+  Note: `1U << n` is acceptable when the mask is known to be 32-bit
+  (e.g., `uint32_t` register fields with `n < 32`). Only flag when
+  the result is stored in, compared against, or returned as a 64-bit
+  type, or when `n` could be >= 32.
+- **Variable overwrite before read (dead store)**: A variable is
+  assigned a value that is unconditionally overwritten before it is
+  ever read. This usually indicates a logic error (wrong variable
+  name, missing `if`, copy-paste mistake) or at minimum is dead code.
+  ```c
+  /* BAD - first assignment is never read */
+  ret = validate_input(cfg);
+  ret = apply_config(cfg);     /* overwrites without checking first ret */
+  if (ret != 0)
+      return ret;
+
+  /* GOOD - check each return value */
+  ret = validate_input(cfg);
+  if (ret != 0)
+      return ret;
+  ret = apply_config(cfg);
+  if (ret != 0)
+      return ret;
+  ```
+  Do NOT flag cases where the initial value is intentionally a default
+  that may or may not be overwritten (e.g., `int ret = 0;` followed
+  by a conditional assignment). Only flag unconditional overwrites
+  where the first value can never be observed.
+- **Shared loop counter in nested loops**: Using the same variable as
+  the loop counter in both an outer and inner loop causes the outer
+  loop to malfunction because the inner loop modifies its counter.
+  ```c
+  /* BAD - inner loop clobbers outer loop counter */
+  int i;
+  for (i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (i = 0; i < nb_descs; i++)    /* BUG: reuses i */
+          init_desc(i);
+  }
+
+  /* GOOD - distinct loop counters */
+  for (int i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (int j = 0; j < nb_descs; j++)
+          init_desc(j);
+  }
+  ```
+- **`memcpy`/`memcmp`/`memset` self-argument (same pointer as both
+  operands)**: Passing the same pointer as both source and destination
+  to `memcpy()` is undefined behavior per C99. Passing the same
+  pointer to both arguments of `memcmp()` is a no-op that always
+  returns 0, indicating a logic error (usually a copy-paste mistake
+  with the wrong variable name). The same applies to `rte_memcpy()`
+  and `memmove()` with identical arguments.
+  ```c
+  /* BAD - memcpy with same src and dst is undefined behavior */
+  memcpy(buf, buf, len);
+  rte_memcpy(dst, dst, len);
+
+  /* BAD - memcmp with same pointer always returns 0 (logic error) */
+  if (memcmp(key, key, KEY_LEN) == 0)  /* always true, wrong variable? */
+
+  /* BAD - likely copy-paste: should be comparing two different MACs */
+  if (memcmp(&eth->src_addr, &eth->src_addr, RTE_ETHER_ADDR_LEN) == 0)
+
+  /* GOOD - comparing two different things */
+  memcpy(dst, src, len);
+  if (memcmp(&eth->src_addr, &eth->dst_addr, RTE_ETHER_ADDR_LEN) == 0)
+  ```
+  This pattern almost always indicates a copy-paste bug where one of
+  the arguments should be a different variable.
+- **`rte_mbuf_raw_free_bulk()` on mixed-pool mbuf arrays**: Tx burst functions
+  and ring/queue dequeue paths receive mbufs that may originate from different
+  mempools (applications are free to send mbufs from any pool).
+  `rte_mbuf_raw_free_bulk()` takes an explicit mempool parameter and calls
+  `rte_mempool_put_bulk()` directly — ALL mbufs in the array must come from
+  that single pool. If mbufs come from different pools, they are returned to
+  the wrong pool, corrupting pool accounting and causing hard-to-debug failures.
+  Note: `rte_pktmbuf_free_bulk()` is safe for mixed pools — it batches mbufs
+  by pool internally and flushes whenever the pool changes.
+  ```c
+  /* BAD - assumes all mbufs are from the same pool */
+  /* (in tx_burst completion or ring dequeue error path) */
+  rte_mbuf_raw_free_bulk(mp, mbufs, nb_mbufs);
+
+  /* GOOD - rte_pktmbuf_free_bulk handles mixed pools correctly */
+  rte_pktmbuf_free_bulk(mbufs, nb_mbufs);
+
+  /* GOOD - free individually (each mbuf returned to its own pool) */
+  for (i = 0; i < nb_mbufs; i++)
+      rte_pktmbuf_free(mbufs[i]);
+  ```
+  This applies to any path that frees mbufs submitted by the application:
+  Tx completion, Tx error cleanup, and ring/queue drain paths.
+  `rte_mbuf_raw_free_bulk()` is an optimization for the fast-free case
+  (`RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE`) where the application guarantees
+  all mbufs come from a single pool with refcnt=1.
+- **MTU confused with Ethernet frame length**: Maximum Transmission Unit
+  (MTU) is the maximum L3 payload size (e.g., 1500 bytes for standard
+  Ethernet). The maximum Ethernet *frame length* includes L2 overhead:
+  Ethernet header (14 bytes) + optional VLAN tags (4 bytes each) + CRC
+  (4 bytes). The overhead varies per device depending on supported
+  encapsulations (VLAN, QinQ, etc.). Confusing MTU with frame length
+  produces off-by-14-to-22-byte errors in packet size limits, buffer
+  sizing, and scattered Rx decisions.
+
+  **Using `rxmode.mtu` after configure:** After `rte_eth_dev_configure()`
+  completes, the canonical MTU is stored in `dev->data->mtu`. The
+  `dev->data->dev_conf.rxmode.mtu` field is the user's *request* and
+  must not be read after configure — it becomes stale if
+  `rte_eth_dev_set_mtu()` is called later. Both configure and set_mtu
+  write to `dev->data->mtu`; PMDs should always read from there.
+
+  **Overhead calculation:** Do not hardcode a single overhead constant.
+  Use the device's own overhead calculation (typically available via
+  `dev_info.max_rx_pktlen - dev_info.max_mtu` or an internal
+  `eth_overhead` field). Different devices support different
+  encapsulations, so the overhead is not a universal constant.
+
+  **Scattered Rx decision:** PMDs compare maximum frame length
+  (MTU + per-device overhead) against Rx buffer size to decide
+  whether scattered Rx is needed. Comparing raw MTU against buffer
+  size is wrong — it underestimates the actual frame size by the
+  overhead.
+  ```c
+  /* BAD - MTU used where frame length is needed */
+  if (dev->data->mtu > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* BAD - hardcoded overhead, wrong for QinQ-capable devices */
+  #define ETHER_OVERHEAD 18  /* may be 22 or 26 for VLAN/QinQ */
+  max_frame = mtu + ETHER_OVERHEAD;
+
+  /* BAD - reading rxmode.mtu after configure (stale if set_mtu called) */
+  static int
+  mydrv_rx_queue_setup(...) {
+      mtu = dev->data->dev_conf.rxmode.mtu;  /* WRONG - may be stale */
+      ...
+  }
+
+  /* GOOD - use dev->data->mtu, the canonical post-configure value */
+  static int
+  mydrv_rx_queue_setup(...) {
+      uint16_t mtu = dev->data->mtu;
+      ...
+  }
+
+  /* GOOD - use per-device overhead for frame length calculation */
+  uint32_t frame_overhead = dev_info.max_rx_pktlen - dev_info.max_mtu;
+  uint32_t max_frame_len = dev->data->mtu + frame_overhead;
+  if (max_frame_len > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* GOOD - device-specific overhead constant derived from capabilities */
+  static uint32_t
+  mydrv_eth_overhead(struct rte_eth_dev *dev) {
+      uint32_t overhead = RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_VLAN)
+          overhead += RTE_VLAN_HLEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_QINQ_STRIP)
+          overhead += RTE_VLAN_HLEN;
+      return overhead;
+  }
+  ```
+  Note: In `rte_eth_dev_configure()` itself, reading `rxmode.mtu` is
+  correct — that is where the user's request is consumed and written
+  to `dev->data->mtu`. Only flag reads of `rxmode.mtu` *outside*
+  configure (queue setup, start, link update, MTU set, etc.).
+- **Missing scatter Rx for large MTU**: When the configured MTU
+  produces a frame size (MTU + Ethernet overhead) larger than the mbuf
+  data buffer size (`rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`),
+  the PMD MUST either enable scatter Rx (multi-segment receive) or reject
+  the configuration. Silently accepting the MTU and then truncating or
+  dropping oversized packets is a correctness bug.
+  ```c
+  /* BAD - accepts MTU but will truncate packets that don't fit */
+  static int
+  mydrv_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+  {
+      /* No check against mbuf size or scatter capability */
+      dev->data->mtu = mtu;
+      return 0;
+  }
+
+  /* BAD - rejects valid MTU even though scatter is enabled */
+  if (frame_size > mbuf_data_size)
+      return -EINVAL;  /* wrong: should allow if scatter is on */
+
+  /* GOOD - check scatter and mbuf size */
+  if (!dev->data->scattered_rx &&
+      frame_size > dev->data->min_rx_buf_size - RTE_PKTMBUF_HEADROOM)
+      return -EINVAL;
+
+  /* GOOD - auto-enable scatter when needed */
+  if (frame_size > mbuf_data_size) {
+      if (!(dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_SCATTER))
+          return -EINVAL;
+      dev->data->dev_conf.rxmode.offloads |=
+          RTE_ETH_RX_OFFLOAD_SCATTER;
+      dev->data->scattered_rx = 1;
+  }
+  ```
+  Key relationships:
+  - `dev_info.max_rx_pktlen`: maximum frame the hardware can receive
+  - `dev_info.max_mtu`: maximum MTU = `max_rx_pktlen` - overhead
+  - `dev_info.min_rx_bufsize`: minimum Rx buffer the HW requires
+  - `dev_info.max_rx_bufsize`: maximum single-descriptor buffer size
+  - `mbuf data size = rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`
+  - When scatter is off: frame length must fit in a single mbuf
+  - When scatter is on: frame length can span multiple mbufs;
+    the PMD selects a scattered Rx function
+
+  This pattern should be checked in three places:
+  1. `dev_configure()` -- validate MTU against mbuf size / scatter
+  2. `rx_queue_setup()` -- select scattered vs non-scattered Rx path
+  3. `mtu_set()` -- runtime MTU change must re-validate
+- **Rx queue function selection ignoring scatter**: When a PMD has
+  separate fast-path Rx functions for scalar (single-segment) and
+  scattered (multi-segment) modes, it must select the scattered
+  variant whenever `dev->data->scattered_rx` is set OR when the
+  configured frame length exceeds the single mbuf data size.
+  Failing to do so causes the scalar Rx function to silently drop
+  or corrupt multi-segment packets.
+  ```c
+  /* BAD - only checks offload flag, ignores actual need */
+  if (rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER)
+      rx_func = mydrv_recv_scattered;
+  else
+      rx_func = mydrv_recv_single;  /* will drop oversized pkts */
+
+  /* GOOD - check both the flag and the size */
+  mbuf_size = rte_pktmbuf_data_room_size(rxq->mp) -
+              RTE_PKTMBUF_HEADROOM;
+  max_pkt = dev->data->mtu + overhead;
+  if ((rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER) ||
+      max_pkt > mbuf_size) {
+      dev->data->scattered_rx = 1;
+      rx_func = mydrv_recv_scattered;
+  } else {
+      rx_func = mydrv_recv_single;
+  }
+  ```
+
+### Architecture & Patterns
+- Code that violates existing patterns in the code base
+- Missing error handling
+- Code that is not safe against signals
+- **Environment variables used for driver configuration instead of devargs**:
+  Drivers must use DPDK device arguments (`devargs`) for runtime
+  configuration, not environment variables. Devargs are preferred because
+  they are obviously device-specific rather than having global impact,
+  some launch methods strip all environment variables, and devargs can
+  be associated on a per-device basis rather than per-device-type.
+  Use `rte_kvargs_parse()` on the devargs string instead.
+  ```c
+  /* BAD - environment variable for driver tuning */
+  val = getenv("MYDRV_RX_BURST_SIZE");
+  if (val != NULL)
+      burst = atoi(val);
+
+  /* GOOD - devargs parsed at probe time */
+  static const char * const valid_args[] = { "rx_burst_size", NULL };
+  kvlist = rte_kvargs_parse(devargs->args, valid_args);
+  rte_kvargs_process(kvlist, "rx_burst_size", &parse_uint, &burst);
+  ```
+  Note: `getenv()` in EAL itself or in test/example code is acceptable.
+  This rule applies to libraries under `lib/` and drivers under `drivers/`.
+
+### New Library API Design
+
+When a patch adds a new library under `lib/`, review API design in
+addition to correctness and style.
+
+**API boundary.** A library should be a compiler, not a framework.
+The model is `rte_acl`: create a context, feed input, get structured
+output, caller decides what to do with it. No callbacks needed. If
+the library requires callers to implement a callback table to
+function, the boundary is wrong — the library is asking the caller
+to be its backend.
+
+**Callback structs** (Warning / Error). Any function-pointer struct
+in an installed header is an ABI break waiting to happen. Adding or
+reordering a member breaks all consumers.
+- Prefer a single callback parameter over an ops table.
+- \>5 callbacks: **Warning** — likely needs redesign.
+- \>20 callbacks: **Error** — this is an app plugin API, not a library.
+- All callbacks must have Doxygen (contract, return values, ownership).
+- Void-returning callbacks for failable operations swallow errors —
+  flag as **Error**.
+- Callbacks serving app-specific needs (e.g. `verbose_level_get`)
+  indicate wrong code was extracted into the library.
+
+**Extensible structures.** Prefer TLV / tagged-array patterns over
+enum + union, following `rte_flow_item` and `rte_flow_action` as
+the model. Type tag + pointer to type-specific data allows adding
+types without ABI breaks. Flag as **Warning**:
+- Large enums (100+) consumers must switch on.
+- Unions that grow with every new feature.
+- Ask: "What changes when a feature is added next release?" If
+  "add an enum value and union arm" — should be TLV.
+
+**Installed headers.** If it's in `headers` or `indirect_headers`
+in meson.build, it's public API. Don't call it "private." If truly
+internal, don't install it.
+
+**Global state.** Prefer handle-based APIs (`create`/`destroy`)
+over singletons. `rte_acl` allows multiple independent classifier
+instances; new libraries should do the same.
+
+**Output ownership.** Prefer caller-allocated or library-allocated-
+caller-freed over internal static buffers. If static buffers are
+used, document lifetime and ensure Doxygen examples don't show
+stale-pointer usage.
+
+---
+
+## C Coding Style
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a new line
+- Code style should be consistent within each file
+
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Reserve the `RTE_`/`rte_` prefix for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` -> Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` -> Use: denylist/allowlist, blocklist/passlist
+- `cripple` -> Use: impacted, degraded, restricted, immobilized
+- `tribe` -> Use: team, squad
+- `sanity check` -> Use: coherence check, test, verification
+
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (likely(p != NULL))   /* Good - likely/unlikely don't change this */
+if (unlikely(p == NULL)) /* Good - likely/unlikely don't change this */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (errno != 0)     /* Good - this IS explicit */
+if (likely(a != 0)) /* Good - likely/unlikely don't change this */
+if (!a)             /* Bad - don't use ! on integers */
+if (a)              /* Bad - implicit, should be a != 0 */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+**Explicit comparison** means using `==` or `!=` operators (e.g., `x != 0`, `p == NULL`).
+**Implicit comparison** means relying on truthiness without an operator (e.g., `if (x)`, `if (!p)`).
+**Note**: `likely()` and `unlikely()` macros do NOT affect whether a comparison is explicit or implicit.
+
+### Boolean Usage
+
+Prefer `bool` (from `<stdbool.h>`) over `int` for variables,
+parameters, and return values that are purely true/false. Using
+`bool` makes intent explicit, enables compiler diagnostics for
+misuse, and is self-documenting.
+
+```c
+/* Bad - int used as boolean flag */
+int verbose = 0;
+int is_enabled = 1;
+
+int
+check_valid(struct item *item)
+{
+	if (item->flags & ITEM_VALID)
+		return 1;
+	return 0;
+}
+
+/* Good - bool communicates intent */
+bool verbose = false;
+bool is_enabled = true;
+
+bool
+check_valid(struct item *item)
+{
+	return (item->flags & ITEM_VALID) != 0;
+}
+```
+
+**Guidelines:**
+- Use `bool` for variables that only hold true/false values
+- Use `bool` return type for predicate functions (functions that
+  answer a yes/no question, often named `is_*`, `has_*`, `can_*`)
+- Use `true`/`false` rather than `1`/`0` for boolean assignments
+- Boolean variables and parameters should not use explicit
+  comparison: `if (verbose)` is correct, not `if (verbose == true)`
+- `int` is still appropriate when a value can be negative, is an
+  error code, or carries more than two states
+
+**Structure fields:**
+- `bool` occupies 1 byte. In packed or cache-critical structures,
+  consider using a bitfield or flags word instead
+- For configuration structures and non-hot-path data, `bool` is
+  preferred over `int` for flag fields
+
+```c
+/* Bad - int flags waste space and obscure intent */
+struct port_config {
+	int promiscuous;     /* 0 or 1 */
+	int link_up;         /* 0 or 1 */
+	int autoneg;         /* 0 or 1 */
+	uint16_t mtu;
+};
+
+/* Good - bool for flag fields */
+struct port_config {
+	bool promiscuous;
+	bool link_up;
+	bool autoneg;
+	uint16_t mtu;
+};
+
+/* Also good - bitfield for cache-critical structures */
+struct fast_path_config {
+	uint32_t flags;      /* bitmask of CONFIG_F_* */
+	/* ... hot-path fields ... */
+};
+```
+
+**Do NOT flag:**
+- `int` return type for functions that return error codes (0 for
+  success, negative for error) — these are NOT boolean
+- `int` used for tri-state or multi-state values
+- `int` flags in existing code where changing the type would be a
+  large, unrelated refactor
+- Bitfield or flags-word approaches in performance-critical
+  structures
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Flexible Array Members
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free() (CWE-14)
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store (CWE-14: Compiler Removal of Code to Clear Buffers). For security-sensitive data, use `explicit_bzero()`, `rte_memset_sensitive()`, or `rte_free_sensitive()` which the compiler is not permitted to eliminate.
+
+```c
+/* Bad - compiler may eliminate memset */
+memset(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - explicit_bzero cannot be optimized away */
+explicit_bzero(secret_key, sizeof(secret_key));
+free(secret_key);
+
+/* Good - DPDK wrapper for clearing sensitive data */
+rte_memset_sensitive(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for rte_malloc'd sensitive data, combined clear+free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()` which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-accessible memory (e.g., a hardware descriptor ring) */
+struct hw_desc *ring = rte_malloc(NULL, n * sizeof(*ring), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+### Non-const Function Pointer Arrays
+
+Arrays of function pointers (ops tables, dispatch tables, callback arrays)
+should be declared `const` when their contents are fixed at compile time.
+A non-`const` function pointer array can be overwritten by bugs or exploits,
+and it prevents the compiler from placing the table in read-only memory.
+
+```c
+/* Bad - mutable when it doesn't need to be */
+static rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+
+/* Good - immutable dispatch table */
+static const rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+```
+
+**Exceptions** (do NOT flag):
+- Arrays modified at runtime for CPU feature detection or capability probing
+  (e.g., selecting a burst function based on `rte_cpu_get_flag_enabled()`)
+- Arrays containing mutable state (e.g., entries that are linked into lists)
+- Arrays populated dynamically via registration APIs
+- `dev_ops` or similar structures assigned per-device at init time
+
+Only flag when the array is fully initialized at declaration with constant
+values and never modified thereafter.
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `getenv()` | `rte_kvargs_parse()` / devargs | drivers/ (allowed in EAL, examples/, app/test/) |
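+
+As a sketch of the substitution (the logtype and messages are illustrative;
+a driver would normally register and use its own logtype rather than `EAL`):
+
+```c
+/* Bad - bypasses the DPDK logging framework */
+perror("init failed");
+printf("port %u started\n", port_id);
+
+/* Good - routed through RTE_LOG, with strerror(errno) replacing perror */
+RTE_LOG(ERR, EAL, "init failed: %s\n", strerror(errno));
+RTE_LOG(INFO, EAL, "port %u started\n", port_id);
+```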
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+
+#### Shared Variable Access: volatile vs Atomics
+
+Variables shared between threads or between a thread and a signal
+handler **must** use atomic operations. The C `volatile` keyword is
+NOT a substitute for atomics — it prevents compiler optimization
+of accesses but provides no atomicity guarantees and no memory
+ordering between threads. On some architectures, `volatile` reads
+and writes may tear on unaligned or multi-word values.
+
+DPDK provides C11 atomic wrappers that are portable across all
+supported compilers and architectures. Always use these for shared
+state.
+
+**Reading shared variables:**
+
+```c
+/* BAD - volatile provides no atomicity or ordering guarantee */
+volatile int stop_flag;
+if (stop_flag)           /* data race, compiler/CPU can reorder */
+    return;
+
+/* BAD - direct access to shared variable without atomic */
+if (shared->running)     /* undefined behavior if another thread writes */
+    process();
+
+/* GOOD - DPDK C11 atomic wrapper */
+if (rte_atomic_load_explicit(&shared->stop_flag, rte_memory_order_acquire))
+    return;
+
+/* GOOD - relaxed is fine for statistics or polling a flag where
+ * you don't need to synchronize other memory accesses */
+count = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+```
+
+**Writing shared variables:**
+
+```c
+/* BAD - volatile write */
+volatile int *flag = &shared->ready;
+*flag = 1;
+
+/* GOOD - atomic store with appropriate ordering */
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+```
+
+**Read-modify-write operations:**
+
+```c
+/* BAD - not atomic even with volatile */
+volatile uint64_t *counter = &stats->packets;
+*counter += nb_rx;       /* lost update: load, add, store are 3 operations */
+
+/* GOOD - atomic add */
+rte_atomic_fetch_add_explicit(&stats->packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Forbidden Atomic APIs in New Code
+
+New code **must not** use GCC/Clang `__atomic_*` built-ins or the
+legacy DPDK `rte_smp_*mb()` barriers. These are deprecated and
+will be removed. Use the DPDK C11 atomic wrappers instead.
+
+**GCC/Clang `__atomic_*` built-ins — do not use:**
+
+```c
+/* BAD - GCC built-in, not portable, not DPDK API */
+val = __atomic_load_n(&shared->count, __ATOMIC_RELAXED);
+__atomic_store_n(&shared->flag, 1, __ATOMIC_RELEASE);
+__atomic_fetch_add(&shared->counter, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&shared->state, &expected, desired,
+    0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+
+/* GOOD - DPDK C11 atomic wrappers */
+val = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+rte_atomic_fetch_add_explicit(&shared->counter, 1, rte_memory_order_relaxed);
+rte_atomic_compare_exchange_strong_explicit(&shared->state, &expected, desired,
+    rte_memory_order_acq_rel, rte_memory_order_acquire);
+rte_atomic_thread_fence(rte_memory_order_seq_cst);
+```
+
+Similarly, do not use `__sync_*` built-ins (`__sync_fetch_and_add`,
+`__sync_bool_compare_and_swap`, etc.) — these are the older GCC
+atomics with implicit full barriers and are even less appropriate
+than `__atomic_*`.
+
+**Legacy DPDK barriers — do not use:**
+
+```c
+/* BAD - legacy DPDK barriers, deprecated */
+rte_smp_mb();            /* full memory barrier */
+rte_smp_rmb();           /* read memory barrier */
+rte_smp_wmb();           /* write memory barrier */
+
+/* GOOD - C11 fence with explicit ordering */
+rte_atomic_thread_fence(rte_memory_order_seq_cst);   /* replaces rte_smp_mb() */
+rte_atomic_thread_fence(rte_memory_order_acquire);    /* replaces rte_smp_rmb() */
+rte_atomic_thread_fence(rte_memory_order_release);    /* replaces rte_smp_wmb() */
+
+/* BETTER - use ordering on the atomic operation itself when possible */
+val = rte_atomic_load_explicit(&shared->flag, rte_memory_order_acquire);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+The legacy `rte_atomic16/32/64_*()` type-specific functions (e.g.,
+`rte_atomic32_inc()`, `rte_atomic64_read()`) are also deprecated.
+Use `rte_atomic_fetch_add_explicit()`, `rte_atomic_load_explicit()`,
+etc. with standard C integer types.
+
+| Deprecated API | Replacement |
+|----------------|-------------|
+| `__atomic_load_n()` | `rte_atomic_load_explicit()` |
+| `__atomic_store_n()` | `rte_atomic_store_explicit()` |
+| `__atomic_fetch_add()` | `rte_atomic_fetch_add_explicit()` |
+| `__atomic_compare_exchange_n()` | `rte_atomic_compare_exchange_strong_explicit()` |
+| `__atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+| `__ATOMIC_RELAXED` | `rte_memory_order_relaxed` |
+| `__ATOMIC_ACQUIRE` | `rte_memory_order_acquire` |
+| `__ATOMIC_RELEASE` | `rte_memory_order_release` |
+| `__ATOMIC_ACQ_REL` | `rte_memory_order_acq_rel` |
+| `__ATOMIC_SEQ_CST` | `rte_memory_order_seq_cst` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence(rte_memory_order_seq_cst)` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence(rte_memory_order_acquire)` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence(rte_memory_order_release)` |
+| `rte_atomic32_inc(&v)` | `rte_atomic_fetch_add_explicit(&v, 1, rte_memory_order_relaxed)` |
+| `rte_atomic64_read(&v)` | `rte_atomic_load_explicit(&v, rte_memory_order_relaxed)` |
+
+#### Memory Ordering Guide
+
+Use the weakest ordering that is correct. Stronger ordering
+constrains hardware and compiler optimization unnecessarily.
+
+| DPDK Ordering | When to Use |
+|---------------|-------------|
+| `rte_memory_order_relaxed` | Statistics counters, polling flags where no other data depends on the value. Most common for simple counters. |
+| `rte_memory_order_acquire` | **Load** side of a flag/pointer that guards access to other shared data. Ensures subsequent reads see data published by the releasing thread. |
+| `rte_memory_order_release` | **Store** side of a flag/pointer that publishes shared data. Ensures all prior writes are visible to a thread that does an acquire load. |
+| `rte_memory_order_acq_rel` | Read-modify-write operations (e.g., `fetch_add`) that both consume and publish shared state in one operation. |
+| `rte_memory_order_seq_cst` | Rarely needed. Only when multiple independent atomic variables must be observed in a globally consistent total order. Avoid unless required. |
+
+**Common pattern — producer/consumer flag:**
+
+```c
+/* Producer thread: fill buffer, then signal ready */
+fill_buffer(buf, data, len);
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+
+/* Consumer thread: wait for flag, then read buffer */
+while (rte_atomic_load_explicit(&shared->ready, rte_memory_order_acquire) == 0)
+    rte_pause();
+process_buffer(buf, len);  /* guaranteed to see producer's writes */
+```
+
+**Common pattern — statistics counter (no ordering needed):**
+
+```c
+rte_atomic_fetch_add_explicit(&port_stats->rx_packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Standalone Fences
+
+Prefer ordering on the atomic operation itself (acquire load,
+release store) over standalone fences. Standalone fences
+(`rte_atomic_thread_fence()`) are a blunt instrument that
+orders ALL memory accesses around the fence, not just the
+atomic variable you care about.
+
+```c
+/* Acceptable but less precise - standalone fence */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_relaxed);
+rte_atomic_thread_fence(rte_memory_order_release);
+
+/* Preferred - ordering on the operation itself */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+Standalone fences are appropriate when synchronizing multiple
+non-atomic writes (e.g., filling a structure before publishing
+a pointer to it) where annotating each write individually is
+impractical.
+
+#### When volatile Is Still Acceptable
+
+`volatile` remains correct for:
+- Memory-mapped I/O registers (hardware MMIO)
+- Variables shared with signal handlers in single-threaded contexts
+- Interaction with `setjmp`/`longjmp`
+
+`volatile` is NOT correct for:
+- Any variable accessed by multiple threads
+- Polling flags between lcores
+- Statistics counters updated from multiple threads
+- Flags set by one thread and read by another
+
+**Do NOT flag** `volatile` used for MMIO or hardware register access
+(common in drivers under `drivers/*/base/`).
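+
+A sketch of the acceptable MMIO case (the device structure and register
+offset are hypothetical):
+
+```c
+/* Acceptable - volatile read of a memory-mapped hardware register */
+static inline uint32_t
+hw_read_status(struct hw_dev *dev)
+{
+	return *(volatile uint32_t *)((uintptr_t)dev->bar0 + HW_STATUS_REG);
+}
+```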
+
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
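+
+A minimal sketch of the EAL thread API (note the worker function returns
+`uint32_t`, not `void *` as with pthreads):
+
+```c
+static uint32_t
+worker_main(void *arg)
+{
+	/* ... do work ... */
+	return 0;
+}
+
+rte_thread_t tid;
+
+if (rte_thread_create(&tid, NULL, worker_main, NULL) != 0)
+	return -1;
+/* ... */
+rte_thread_join(tid, NULL);
+```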
+
+### Process-Shared Synchronization
+
+When placing synchronization primitives in shared memory (memory accessible by multiple processes, such as DPDK primary/secondary processes or `mmap`'d regions), they **must** be initialized with process-shared attributes. Failure to do so causes **undefined behavior** that may appear to work in testing but fail unpredictably in production.
+
+#### pthread Mutexes in Shared Memory
+
+**This is an error** - mutex in shared memory without `PTHREAD_PROCESS_SHARED`:
+
+```c
+/* BAD - undefined behavior when used across processes */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+void init_shared(struct shared_data *shm) {
+	pthread_mutex_init(&shm->lock, NULL);  /* ERROR: missing pshared attribute */
+}
+```
+
+**Correct implementation**:
+
+```c
+/* GOOD - properly initialized for cross-process use */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int init_shared(struct shared_data *shm) {
+	pthread_mutexattr_t attr;
+	int ret;
+
+	ret = pthread_mutexattr_init(&attr);
+	if (ret != 0)
+		return -ret;
+
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret != 0) {
+		pthread_mutexattr_destroy(&attr);
+		return -ret;
+	}
+
+	ret = pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	return -ret;
+}
+```
+
+#### pthread Condition Variables in Shared Memory
+
+Condition variables also require the process-shared attribute:
+
+```c
+/* BAD - will not work correctly across processes */
+pthread_cond_init(&shm->cond, NULL);
+
+/* GOOD */
+pthread_condattr_t cattr;
+pthread_condattr_init(&cattr);
+pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+pthread_cond_init(&shm->cond, &cattr);
+pthread_condattr_destroy(&cattr);
+```
+
+#### pthread Read-Write Locks in Shared Memory
+
+```c
+/* BAD */
+pthread_rwlock_init(&shm->rwlock, NULL);
+
+/* GOOD */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init(&rwattr);
+pthread_rwlockattr_setpshared(&rwattr, PTHREAD_PROCESS_SHARED);
+pthread_rwlock_init(&shm->rwlock, &rwattr);
+pthread_rwlockattr_destroy(&rwattr);
+```
+
+#### When to Flag This Issue
+
+Flag as an **Error** when ALL of the following are true:
+1. A `pthread_mutex_t`, `pthread_cond_t`, `pthread_rwlock_t`, or `pthread_barrier_t` is initialized
+2. The primitive is stored in shared memory (identified by context such as: structure in `rte_malloc`/`rte_memzone`, `mmap`'d memory, memory passed to secondary processes, or structures documented as shared)
+3. The initialization uses `NULL` attributes or attributes without `PTHREAD_PROCESS_SHARED`
+
+**Do NOT flag** when:
+- The mutex is in thread-local or process-private heap memory (`malloc`)
+- The mutex is a local/static variable not in shared memory
+- The code already uses `pthread_mutexattr_setpshared()` with `PTHREAD_PROCESS_SHARED`
+- The synchronization uses DPDK primitives (`rte_spinlock_t`, `rte_rwlock_t`) which are designed for shared memory
+
+#### Preferred Alternatives
+
+For DPDK code, prefer DPDK's own synchronization primitives which are designed for shared memory:
+
+| pthread Primitive | DPDK Alternative |
+|-------------------|------------------|
+| `pthread_mutex_t` | `rte_spinlock_t` (busy-wait) or properly initialized pthread mutex |
+| `pthread_rwlock_t` | `rte_rwlock_t` |
+| `pthread_spinlock_t` | `rte_spinlock_t` |
+
+Note: `rte_spinlock_t` and `rte_rwlock_t` work correctly in shared memory without special initialization, but they are spinning locks unsuitable for long wait times.
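+
+For example, a shared-memory structure can use a spinlock with no
+process-shared attribute dance:
+
+```c
+/* Works across processes with no special initialization */
+struct shared_data {
+	rte_spinlock_t lock;
+	int counter;
+};
+
+void
+init_shared(struct shared_data *shm)
+{
+	rte_spinlock_init(&shm->lock);
+	shm->counter = 0;
+}
+```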
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
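+
+For example (the variable name is illustrative):
+
+```c
+/* Bad - raw GCC attribute */
+static int debug_level __attribute__((unused));
+
+/* Good - portable macro from rte_common.h */
+static int debug_level __rte_unused;
+```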
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `"%" PRId64`, `"%" PRIu64`, `"%" PRIx64` |
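+
+The `PRI*64` macros from `<inttypes.h>` are string literals concatenated
+into the format string, not conversion specifiers themselves. A standalone
+sketch:
+
+```c
+#include <inttypes.h>
+#include <stdio.h>
+
+int
+main(void)
+{
+	uint64_t pkts = 12345;
+
+	/* Bad:  "%llu" assumes uint64_t is unsigned long long */
+	/* Good: "%" PRIu64 is portable across 32- and 64-bit platforms */
+	printf("rx packets: %" PRIu64 "\n", pkts);
+	return 0;
+}
+```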
+
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
+---
+
+## Deprecated API Usage
+
+New patches must not introduce usage of deprecated APIs, macros, or functions.
+Deprecated items are marked with `RTE_DEPRECATED` or documented in the
+deprecation notices section of the release notes.
+
+### Rules for New Code
+
+- Do not call functions marked with `RTE_DEPRECATED` or `__rte_deprecated`
+- Do not use macros that have been superseded by newer alternatives
+- Do not use data structures or enum values marked as deprecated
+- Check `doc/guides/rel_notes/deprecation.rst` for planned deprecations
+- When a deprecated API has a replacement, use the replacement
+
+### Deprecating APIs
+
+A patch may mark an API as deprecated provided:
+
+- No remaining usages exist in the current DPDK codebase
+- The deprecation is documented in the release notes
+- A migration path or replacement API is documented
+- The `RTE_DEPRECATED` macro is used to generate compiler warnings
+
+```c
+/* Marking a function as deprecated */
+__rte_deprecated
+int
+rte_old_function(void);
+
+/* With a message pointing to the replacement */
+__rte_deprecated_msg("use rte_new_function() instead")
+int
+rte_old_function(void);
+```
+
+### Common Deprecated Patterns
+
+| Deprecated | Replacement | Notes |
+|-----------|-------------|-------|
+| `rte_atomic*_t` types | C11 atomics | Use `rte_atomic_xxx()` wrappers |
+| `rte_smp_*mb()` barriers | `rte_atomic_thread_fence()` | See Atomics section |
+| `pthread_*()` in portable code | `rte_thread_*()` | See Threading section |
+
+When reviewing patches that add new code, flag any usage of deprecated APIs
+as requiring change to use the modern replacement.
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - in .c file */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+**Note for AI reviewers**: You cannot verify compilation order or cross-patch dependencies from patch review alone. Do NOT flag patches claiming they "would fail to compile" based on symbols used in other patches in the series. Assume the patch author has ordered them correctly.
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be used in `/app` test directory
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include "test.h"
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, NOHUGE_OK, ASAN_OK, test_feature);
+```
+
+The `REGISTER_FAST_TEST` macro parameters are:
+- Test name (e.g., `feature_autotest`)
+- `NOHUGE_OK` or `HUGEPAGES_REQUIRED` - whether test can run without hugepages
+- `ASAN_OK` or `ASAN_FAILS` - whether test is compatible with Address Sanitizer
+- Test function name
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+- Release notes are NOT required for:
+  - Test-only changes (unit tests, functional tests)
+  - Internal APIs and helper functions (not exported to applications)
+  - Internal implementation changes that don't affect public API
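+
+A minimal Doxygen sketch in the DPDK style (the function and structure
+names are hypothetical):
+
+```c
+/**
+ * Create a widget from the given configuration.
+ *
+ * @param config
+ *   Pointer to the widget configuration. Must not be NULL.
+ * @return
+ *   0 on success, negative errno-style value on error.
+ */
+int
+rte_widget_create(const struct rte_widget_config *config);
+```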
+
+### RST Documentation Style
+
+When reviewing `.rst` documentation files, prefer **definition lists**
+over simple bullet lists where each item has a term and a description.
+Definition lists produce better-structured HTML/PDF output and are
+easier to scan.
+
+**When to suggest a definition list:**
+- A bullet list where each item starts with a bold or emphasized term
+  followed by a dash, colon, or long explanation
+- Lists of options, parameters, configuration values, or features
+  where each entry has a name and a description
+- Glossary-style enumerations
+
+**When a simple list is fine (do NOT flag):**
+- Short lists of items without descriptions (e.g., file names, steps)
+- Lists where items are single phrases or sentences with no term/definition structure
+- Enumerated steps in a procedure
+
+**RST definition list syntax:**
+
+```rst
+term 1
+   Description of term 1.
+
+term 2
+   Description of term 2.
+   Can span multiple lines.
+```
+
+**Example — flag this pattern:**
+
+```rst
+* **error** - Fail with error (default)
+* **truncate** - Truncate content to fit token limit
+* **summary** - Request high-level summary review
+```
+
+**Suggest rewriting as:**
+
+```rst
+error
+   Fail with error (default).
+
+truncate
+   Truncate content to fit token limit.
+
+summary
+   Request high-level summary review.
+```
+
+This is a **Warning**-level suggestion, not an Error. Do not flag it
+when the existing list structure is appropriate (see "when a simple
+list is fine" above).
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+- Internal APIs (used only within DPDK, not exported to applications) do NOT require release notes
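+
+A minimal sketch of the header-side declaration for a new experimental
+API (function name illustrative; the matching `RTE_EXPORT_EXPERIMENTAL_SYMBOL`
+annotation belongs in the `.c` file):
+
+```c
+/* rte_foo.h */
+__rte_experimental
+int
+rte_foo_new_feature(void);
+```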
+
+### ABI Compatibility and Symbol Exports
+
+**IMPORTANT**: DPDK uses automatic symbol map generation. Do **NOT** recommend
+manually editing `version.map` files - they are auto-generated from source code
+annotations.
+
+#### Symbol Export Macros
+
+New public functions must be annotated with export macros (defined in
+`rte_export.h`). Place the macro on the line immediately before the function
+definition in the `.c` file:
+
+```c
+/* For stable ABI symbols */
+RTE_EXPORT_SYMBOL(rte_foo_create)
+int
+rte_foo_create(struct rte_foo_config *config)
+{
+    /* ... */
+}
+
+/* For experimental symbols (include version when first added) */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_new_feature, 25.03)
+__rte_experimental
+int
+rte_foo_new_feature(void)
+{
+    /* ... */
+}
+
+/* For internal symbols (shared between DPDK components only) */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_foo_internal_helper)
+int
+rte_foo_internal_helper(void)
+{
+    /* ... */
+}
+```
+
+#### Symbol Export Rules
+
+- `RTE_EXPORT_SYMBOL` - Use for stable ABI functions
+- `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, ver)` - Use for new experimental APIs
+  (version is the DPDK release, e.g., `25.03`)
+- `RTE_EXPORT_INTERNAL_SYMBOL` - Use for functions shared between DPDK libs/drivers
+  but not part of public API
+- Export macros go in `.c` files, not headers
+- The build system generates linker version maps automatically
+
+#### What NOT to Review
+
+- Do **NOT** flag missing `version.map` updates - maps are auto-generated
+- Do **NOT** suggest adding symbols to `lib/*/version.map` files
+
+#### ABI Versioning for Changed Functions
+
+When changing the signature of an existing stable function, use versioning macros
+from `rte_function_versioning.h`:
+
+- `RTE_VERSION_SYMBOL` - Create versioned symbol for backward compatibility
+- `RTE_DEFAULT_SYMBOL` - Mark the new default version
+
+Follow ABI policy and versioning guidelines in the contributor documentation.
+Enable ABI checks with `DPDK_ABI_REF_VERSION` environment variable.
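+
+A hedged sketch of the versioning macros (argument layout per
+`rte_function_versioning.h`; the API name, versions, and the shared
+static helper `foo_create_impl` are all illustrative):
+
+```c
+/* Keep the old signature available at the previous symbol version */
+RTE_VERSION_SYMBOL(24, int, rte_foo_create, (struct rte_foo_config *config))
+{
+    return foo_create_impl(config, 0);
+}
+
+/* Export the new signature as the default version */
+RTE_DEFAULT_SYMBOL(25, int, rte_foo_create,
+    (struct rte_foo_config *config, int flags))
+{
+    return foo_create_impl(config, flags);
+}
+```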
+
+---
+
+## LTS (Long Term Stable) Release Review
+
+LTS releases are DPDK versions ending in `.11` (e.g., 23.11, 22.11,
+21.11, 20.11, 19.11). When reviewing patches targeting an LTS branch,
+apply stricter criteria:
+
+### LTS-Specific Rules
+
+- **Only bug fixes allowed** -- no new features
+- **No new APIs** (experimental or stable)
+- **ABI must remain unchanged** -- no symbol additions, removals,
+  or signature changes
+- Backported fixes should reference the original commit with a
+  `Fixes:` tag
+- Copyright years should reflect when the code was originally
+  written
+- Be conservative: reject changes that are not clearly bug fixes
+
+### What to Flag on LTS Branches
+
+**Error:**
+- New feature code (new functions, new driver capabilities)
+- New experimental or stable API additions
+- ABI changes (new or removed symbols, changed function signatures)
+- Changes that add new configuration options or parameters
+
+**Warning:**
+- Large refactoring that goes beyond what is needed for a fix
+- Missing `Fixes:` tag on a backported bug fix
+- Missing `Cc: stable@dpdk.org`
+
+### When LTS Rules Apply
+
+LTS rules apply when the reviewer is told the target release is an
+LTS version (via the `--release` option or equivalent). If no
+release is specified, assume the patch targets the main development
+branch where new features and APIs are allowed.
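+
+For example, to apply LTS rules when reviewing a backport with the
+review script from this series (patch file name illustrative):
+
+```bash
+./devtools/analyze-patch.py --release 23.11 0001-net-foo-fix-leak.patch
+```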
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message and License
+
+Checked by `devtools/checkpatches.sh` -- not duplicated here.
+
+### Code Style
+
+- [ ] Lines <=100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+- [ ] No usage of deprecated APIs, macros, or functions
+- [ ] Process-shared primitives in shared memory use `PTHREAD_PROCESS_SHARED`
+- [ ] `mmap()` return checked against `MAP_FAILED`, not `NULL`
+- [ ] Statistics use `+=` not `=` for accumulation
+- [ ] Integer multiplies widened before operation when result is 64-bit
+- [ ] Descriptor chain traversals bounded by ring size or loop counter
+- [ ] 64-bit bitmasks use `1ULL <<` or `RTE_BIT64()`, not `1 <<`
+- [ ] No unconditional variable overwrites before read
+- [ ] Nested loops use distinct counter variables
+- [ ] No `memcpy`/`memcmp` with identical source and destination pointers
+- [ ] `rte_mbuf_raw_free_bulk()` not used on mixed-pool mbuf arrays (Tx paths, ring dequeue, error paths)
+- [ ] MTU not confused with frame length (MTU = L3 payload, frame = MTU + L2 overhead)
+- [ ] PMDs read `dev->data->mtu` after configure, not `dev_conf.rxmode.mtu`
+- [ ] Ethernet overhead not hardcoded -- derived from device capabilities
+- [ ] Scatter Rx enabled or error returned when frame length exceeds single mbuf data size
+- [ ] `mtu_set` allows large MTU when scatter Rx is active; re-selects Rx burst function
+- [ ] Rx queue setup selects scattered Rx function when frame length exceeds mbuf
+- [ ] Static function pointer arrays declared `const` when contents are compile-time fixed
+- [ ] `bool` used for pure true/false variables, parameters, and predicate return types
+- [ ] Shared variables use `rte_atomic_*_explicit()`, not `volatile` or bare access
+- [ ] No `__atomic_*()` GCC built-ins or `__ATOMIC_*` ordering constants (use `rte_atomic_*_explicit()` and `rte_memory_order_*`)
+- [ ] No `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` (use `rte_atomic_thread_fence()`)
+- [ ] Memory ordering is the weakest correct choice (`relaxed` for counters, `acquire`/`release` for publish/consume)
+- [ ] Sensitive data cleared with `explicit_bzero()`/`rte_free_sensitive()`, not `memset()`
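+
+A few of the checklist items above as a sketch (variable names
+illustrative):
+
+```c
+uint64_t mask = RTE_BIT64(40);              /* not 1 << 40 */
+
+stats->rx_packets += nb_rx;                 /* accumulate, do not overwrite */
+
+uint64_t bytes = (uint64_t)nb_rx * seg_len; /* widen before 32x32 multiply */
+
+void *addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
+if (addr == MAP_FAILED)                     /* not == NULL */
+    return -errno;
+```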
+
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+- [ ] New public functions have `RTE_EXPORT_*` macro in `.c` file
+- [ ] Experimental functions use `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, version)`
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] RST docs use definition lists for term/description patterns
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (<=3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+- No strict line length limit; do not flag lines under 100 characters
+
+```python
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list
+headers = files(
+	'header1.h',
+	'header2.h',
+	'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must pass **`black`** formatting validation
+- Line length up to 100 characters
+
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+
+*Correctness bugs (highest value findings):*
+- Use-after-free
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference on reachable code path
+- Buffer overflow or out-of-bounds access
+- Missing error check on a function that can fail, leading to undefined behavior
+- Race condition on shared mutable state without synchronization
+- `volatile` used instead of atomics for inter-thread shared variables
+- `__atomic_*()` GCC built-ins in new code (must use `rte_atomic_*_explicit()`)
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` in new code (must use `rte_atomic_thread_fence()`)
+- Error path that skips necessary cleanup
+- `mmap()` return value checked against NULL instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=` (overwrite vs increment)
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied indices
+- `1 << n` used for 64-bit bitmask (undefined behavior if n >= 32)
+- Variable assigned then unconditionally overwritten before read
+- Same variable used as counter in nested loops
+- `memcpy`/`memcmp` with same pointer as both arguments (UB or no-op logic error)
+- `rte_mbuf_raw_free_bulk()` on mbuf array where mbufs may come from different pools (Tx burst, ring dequeue)
+- MTU used where frame length is needed or vice versa (off by L2 overhead)
+- `dev_conf.rxmode.mtu` read after configure instead of `dev->data->mtu` (stale value)
+- MTU accepted without scatter Rx when frame size exceeds single mbuf capacity (silent truncation/drop)
+- `mtu_set` rejects valid MTU when scatter Rx is already enabled
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size comparison
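+
+The atomics rules above, as a sketch (counter name illustrative):
+
+```c
+/* Error: GCC built-in with __ATOMIC_* ordering constant */
+__atomic_fetch_add(&stats->drops, 1, __ATOMIC_RELAXED);
+
+/* Correct: DPDK wrapper with rte_memory_order_* */
+rte_atomic_fetch_add_explicit(&stats->drops, 1, rte_memory_order_relaxed);
+```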
+
+*Process and format errors:*
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+- pthread mutex/cond/rwlock in shared memory without `PTHREAD_PROCESS_SHARED`
+
+*API design errors (new libraries only):*
+- Ops/callback struct with 20+ function pointers in an installed header
+- Callback struct members with no Doxygen documentation
+- Void-returning callbacks for failable operations (errors silently swallowed)
+
+**Warning** (should fix):
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- New public function missing `RTE_EXPORT_*` macro
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+- Usage of deprecated APIs, macros, or functions in new code
+- RST documentation using bullet lists where definition lists would be more appropriate
+- Ops/callback struct with >5 function pointers in an installed header (ABI risk)
+- New API using fixed enum+union where TLV pattern would be more extensible
+- Installed header labeled "private" or "internal" in meson.build
+- New library using global singleton instead of handle-based API
+- Static function pointer array not declared `const` when contents are compile-time constant
+- `int` used instead of `bool` for variables or return values that are purely true/false
+- `rte_memory_order_seq_cst` used where weaker ordering (`relaxed`, `acquire`/`release`) suffices
+- Standalone `rte_atomic_thread_fence()` where ordering on the atomic operation itself would be clearer
+- `getenv()` used in a driver or library for runtime configuration instead of devargs
+- Hardcoded Ethernet overhead constant instead of per-device overhead calculation
+- PMD does not advertise `RTE_ETH_RX_OFFLOAD_SCATTER` in `rx_offload_capa` but hardware supports multi-segment Rx
+- PMD `dev_info` reports `max_rx_pktlen` or `max_mtu` inconsistent with each other or with the Ethernet overhead
+- `mtu_set` callback does not re-select the Rx burst function after changing MTU
+
+**Do NOT flag** (common false positives):
+- Missing `version.map` updates (maps are auto-generated from `RTE_EXPORT_*` macros)
+- Suggesting manual edits to any `version.map` file
+- SPDX/copyright format, copyright years, copyright holders (not subject to AI review)
+- Commit message formatting (subject length, punctuation, tag order, case-sensitive terms) -- checked by checkpatch
+- Meson file lines under 100 characters
+- Comparisons using `== 0`, `!= 0`, `== NULL`, `!= NULL` as "implicit" (these ARE explicit)
+- Comparisons wrapped in `likely()` or `unlikely()` macros - these are still explicit if using == or !=
+- Anything you determine is correct (do not mention non-issues or say "No issue here")
+- `REGISTER_FAST_TEST` using `NOHUGE_OK`/`ASAN_OK` macros (this is the correct current format)
+- Missing release notes for test-only changes (unit tests do not require release notes)
+- Missing release notes for internal APIs or helper functions (only public APIs need release notes)
+- Any item you later correct with "(Correction: ...)" or "actually acceptable" - just omit it
+- Vague concerns ("should be verified", "should be checked") - if you're not sure it's wrong, don't flag it
+- Items where you say "which is correct" or "this is correct" - if it's correct, don't mention it at all
+- Items where you conclude "no issue here" or "this is actually correct" - omit these entirely
+- Clean patches in a series - do not include a patch just to say "no issues" or describe what it does
+- Cross-patch compilation dependencies - you cannot determine patch ordering correctness from review
+- Claims that a symbol "was removed in patch N" causing issues in patch M - assume author ordered correctly
+- Any speculation about whether patches will compile when applied in sequence
+- Mutexes/locks in process-private memory (standard `malloc`, stack, static non-shared) - these don't need `PTHREAD_PROCESS_SHARED`
+- Use of `rte_spinlock_t` or `rte_rwlock_t` in shared memory (these work correctly without special init)
+- `volatile` used for MMIO/hardware register access in drivers (this is correct usage)
+- `getenv()` used in EAL, examples, app/test, or build/config scripts (only flag in drivers/ and lib/)
+- Reading `rxmode.mtu` inside `rte_eth_dev_configure()` implementation (that is where the user request is consumed)
+- `=` assignment to MTU or frame length fields during initial setup (only flag stale reads of `rxmode.mtu` outside configure)
+- PMDs that auto-enable scatter when MTU exceeds mbuf size (this is the correct pattern)
+- Hardcoded `RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN` as overhead when the PMD does not support VLAN and device info is consistent
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+# Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example:
+This could panic if the string is NULL.
+Suggested fix: add a NULL check, e.g. `if (str == NULL) return -EINVAL;`
+
+---
+
+## FINAL CHECK BEFORE SUBMITTING REVIEW
+
+Before outputting your review, do two separate passes:
+
+### Pass 1: Verify correctness bugs are included
+
+Ask: "Did I trace every error path for resource leaks? Did I check
+for use-after-free? Did I verify error codes are propagated?"
+
+If you identified a potential correctness bug but talked yourself
+out of it, **add it back**. It is better to report a possible bug
+than to miss a real one.
+
+### Pass 2: Remove style/process false positives
+
+For EACH style/process item, ask: "Did I conclude this is actually
+fine/correct/acceptable/no issue?"
+
+If YES, DELETE THAT ITEM. It should not be in your output.
+
+An item that says "X is wrong... actually this is correct" is a
+FALSE POSITIVE and must be removed. This applies to style, format,
+and process items only.
+
+**If your Errors section would be empty after this check, that's
+fine -- it means the patches are good.**
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 2/6] devtools: add multi-provider AI patch review script
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
@ 2026-03-10  1:57     ` Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This is an AI generated script to review DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 1348 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1348 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..4a2950d6a4
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,1348 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from datetime import date
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, Iterator
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Large file handling modes
+LARGE_FILE_MODES = ["error", "truncate", "chunk", "commits-only", "summary"]
+
+# Approximate characters per token (conservative estimate for code)
+CHARS_PER_TOKEN = 3.5
+
+# Default token limits by provider (leaving room for system prompt and response)
+PROVIDER_INPUT_LIMITS = {
+    "anthropic": 180000,  # 200K context, reserve for system/response
+    "openai": 900000,  # GPT-4.1 has 1M context
+    "xai": 1800000,  # Grok 4.1 Fast has 2M context
+    "google": 900000,  # Gemini 3 Flash has 1M context
+}
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# LTS releases: any DPDK release with minor version .11
+# (e.g., 19.11, 20.11, 21.11, 22.11, 23.11, 24.11, 25.11, ...)
+
+SYSTEM_PROMPT_BASE = """\
+You are an expert DPDK code reviewer. Analyze patches for compliance with \
+DPDK coding standards and contribution guidelines. Provide clear, actionable \
+feedback organized by severity (Error, Warning, Info) as defined in the \
+guidelines."""
+
+LTS_RULES = """
+LTS (Long Term Stable) branch rules apply:
+- Only bug fixes allowed, no new features
+- No new APIs (experimental or stable)
+- ABI must remain unchanged
+- Backported fixes should reference the original commit with Fixes: tag
+- Copyright years should reflect when the code was originally written
+- Be conservative: reject changes that aren't clearly bug fixes"""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Focus on:
+
+1. Correctness bugs (resource leaks, use-after-free, race conditions, etc.)
+2. C coding style (forbidden tokens, implicit comparisons, unnecessary patterns)
+3. API and documentation requirements
+4. Any other guideline violations
+
+Note: commit message formatting and SPDX/copyright compliance are checked \
+by checkpatches.sh and should NOT be flagged here.
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def is_lts_release(release: str | None) -> bool:
+    """Check if a release is an LTS release.
+
+    Per DPDK project guidelines, any release with minor version .11
+    is an LTS release (e.g., 19.11, 21.11, 23.11, 24.11, 25.11).
+    """
+    if not release:
+        return False
+    # Check for explicit -lts suffix
+    if "-lts" in release.lower():
+        return True
+    # Extract base version (e.g., "23.11" from "23.11.1" or "23.11-rc1")
+    version = release.split("-")[0]
+    parts = version.split(".")
+    if len(parts) >= 2:
+        try:
+            minor = int(parts[1])
+            return minor == 11
+        except ValueError:
+            pass
+    return False
+
+
+def estimate_tokens(text: str) -> int:
+    """Estimate token count from text length."""
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def split_mbox_patches(content: str) -> list[str]:
+    """Split an mbox file into individual patches."""
+    patches = []
+    current_patch = []
+    in_patch = False
+
+    for line in content.split("\n"):
+        # Detect start of new message in mbox format
+        if line.startswith("From ") and (
+            " Mon " in line
+            or " Tue " in line
+            or " Wed " in line
+            or " Thu " in line
+            or " Fri " in line
+            or " Sat " in line
+            or " Sun " in line
+        ):
+            if current_patch:
+                patches.append("\n".join(current_patch))
+            current_patch = [line]
+            in_patch = True
+        elif in_patch:
+            current_patch.append(line)
+
+    # Don't forget the last patch
+    if current_patch:
+        patches.append("\n".join(current_patch))
+
+    return patches if patches else [content]
+
+
+def extract_commit_messages(content: str) -> str:
+    """Extract only commit messages from patch content."""
+    patches = split_mbox_patches(content)
+    messages = []
+
+    for patch in patches:
+        lines = patch.split("\n")
+        msg_lines = []
+        in_headers = True
+        in_body = False
+        found_subject = False
+
+        for line in lines:
+            # Collect headers we care about
+            if in_headers:
+                if line.startswith("Subject:"):
+                    msg_lines.append(line)
+                    found_subject = True
+                elif line.startswith(("From:", "Date:")):
+                    msg_lines.append(line)
+                elif line.startswith((" ", "\t")) and found_subject:
+                    # Subject continuation
+                    msg_lines.append(line)
+                elif line == "":
+                    if found_subject:
+                        in_headers = False
+                        in_body = True
+                        msg_lines.append("")
+            elif in_body:
+                # Stop at the diff
+                if line.startswith("---") and not line.startswith("----"):
+                    break
+                if line.startswith("diff --git"):
+                    break
+                msg_lines.append(line)
+
+        if msg_lines:
+            messages.append("\n".join(msg_lines))
+
+    return "\n\n---\n\n".join(messages)
+
+
+def truncate_content(
+    content: str, max_tokens: float, provider: str
+) -> tuple[str, bool]:
+    """Truncate content to fit within token limit."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN)
+
+    if len(content) <= max_chars:
+        return content, False
+
+    # Try to truncate at a reasonable boundary
+    truncated = content[:max_chars]
+
+    # Find last complete diff hunk or patch boundary
+    last_diff = truncated.rfind("\ndiff --git")
+    last_patch = truncated.rfind("\nFrom ")
+
+    if last_diff > max_chars * 0.5:
+        truncated = truncated[:last_diff]
+    elif last_patch > max_chars * 0.5:
+        truncated = truncated[:last_patch]
+
+    truncated += "\n\n[... Content truncated due to size limits ...]\n"
+    return truncated, True
+
+
+def chunk_content(
+    content: str, max_tokens: int, provider: str
+) -> Iterator[tuple[str, int, int]]:
+    """Split content into chunks that fit within token limit.
+
+    Yields tuples of (chunk_content, chunk_number, total_chunks).
+    """
+    patches = split_mbox_patches(content)
+
+    if len(patches) == 1:
+        # Single large patch - split by diff sections
+        yield from chunk_single_patch(content, max_tokens)
+        return
+
+    # Multiple patches - group them to fit within limits
+    chunks = []
+    current_chunk = []
+    current_size = 0
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)  # 90% to leave margin
+
+    for patch in patches:
+        patch_size = len(patch)
+        if current_size + patch_size > max_chars and current_chunk:
+            chunks.append("\n".join(current_chunk))
+            current_chunk = []
+            current_size = 0
+
+        if patch_size > max_chars:
+            # Single patch too large, truncate it
+            if current_chunk:
+                chunks.append("\n".join(current_chunk))
+                current_chunk = []
+                current_size = 0
+            truncated, _ = truncate_content(patch, max_tokens * 0.9, provider)
+            chunks.append(truncated)
+        else:
+            current_chunk.append(patch)
+            current_size += patch_size
+
+    if current_chunk:
+        chunks.append("\n".join(current_chunk))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def chunk_single_patch(content: str, max_tokens: int) -> Iterator[tuple[str, int, int]]:
+    """Split a single large patch by diff sections."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)
+
+    # Extract header (everything before first diff)
+    first_diff = content.find("\ndiff --git")
+    if first_diff == -1:
+        # No diff sections, just truncate
+        truncated, _ = truncate_content(content, max_tokens * 0.9, "anthropic")
+        yield truncated, 1, 1
+        return
+
+    header = content[: first_diff + 1]
+    diff_content = content[first_diff + 1 :]
+
+    # Split by diff sections
+    diffs = []
+    current_diff = []
+    for line in diff_content.split("\n"):
+        if line.startswith("diff --git") and current_diff:
+            diffs.append("\n".join(current_diff))
+            current_diff = []
+        current_diff.append(line)
+    if current_diff:
+        diffs.append("\n".join(current_diff))
+
+    # Group diffs into chunks
+    chunks = []
+    current_chunk_diffs = []
+    current_size = len(header)
+
+    for diff in diffs:
+        diff_size = len(diff)
+        if current_size + diff_size > max_chars and current_chunk_diffs:
+            chunks.append(header + "\n".join(current_chunk_diffs))
+            current_chunk_diffs = []
+            current_size = len(header)
+
+        if diff_size + len(header) > max_chars:
+            # Single diff too large
+            if current_chunk_diffs:
+                chunks.append(header + "\n".join(current_chunk_diffs))
+                current_chunk_diffs = []
+            truncated_diff = diff[: max_chars - len(header) - 100]
+            truncated_diff += "\n[... diff truncated ...]\n"
+            chunks.append(header + truncated_diff)
+            current_size = len(header)
+        else:
+            current_chunk_diffs.append(diff)
+            current_size += diff_size
+
+    if current_chunk_diffs:
+        chunks.append(header + "\n".join(current_chunk_diffs))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def get_summary_prompt() -> str:
+    """Get prompt modifications for summary mode."""
+    return """
+NOTE: This is a LARGE patch series. Provide a HIGH-LEVEL summary review only:
+- Focus on overall architecture and design concerns
+- Check commit message formatting across the series
+- Identify any obvious policy violations
+- Do NOT attempt detailed line-by-line code review
+- Summarize the scope and purpose of the changes
+"""
+
+
+def format_combined_reviews(
+    reviews: list[tuple[str, str]], output_format: str, patch_name: str
+) -> str:
+    """Combine multiple chunk/patch reviews into a single output."""
+    if output_format == "json":
+        combined = {
+            "patch_file": patch_name,
+            "sections": [
+                {"label": label, "review": review} for label, review in reviews
+            ],
+        }
+        return json.dumps(combined, indent=2)
+    elif output_format == "html":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"<h2>{label}</h2>\n{review}")
+        return "\n<hr>\n".join(sections)
+    elif output_format == "markdown":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"## {label}\n\n{review}")
+        return "\n\n---\n\n".join(sections)
+    else:  # text
+        sections = []
+        for label, review in reviews:
+            sections.append(f"=== {label} ===\n\n{review}")
+        separator = "\n\n" + "=" * 60 + "\n\n"
+        return separator.join(sections)
+
+
+def build_system_prompt(review_date: str, release: str | None) -> str:
+    """Build system prompt with date and release context."""
+    prompt = SYSTEM_PROMPT_BASE
+    prompt += f"\n\nCurrent date: {review_date}."
+
+    if release:
+        prompt += f"\nTarget DPDK release: {release}."
+        if is_lts_release(release):
+            prompt += LTS_RULES
+        else:
+            prompt += "\nThis is a main branch or standard release."
+            prompt += "\nNew features and experimental APIs are allowed."
+
+    return prompt
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": system_prompt},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": system_prompt},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": system_prompt}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+    verbose: bool = False,
+) -> str:
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: {usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: {usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def get_last_message_id(patch_content: str) -> str | None:
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-ID:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize to the canonical <...> form
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content: str) -> str | None:
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][len("subject:") :].strip()
+            i += 1
+            # Unfold continuation lines, joining with a single space
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+) -> bool:
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try sendmail
+        try:
+            subprocess.run(
+                ["sendmail", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via sendmail", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        # Try msmtp
+        try:
+            subprocess.run(
+                ["msmtp", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via msmtp", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s -r 24.11 patch.patch           # Review for specific release
+    %(prog)s -r 24.11-lts patch.patch       # Review for LTS branch
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+
+Large File Handling:
+    %(prog)s --split-patches series.mbox    # Review each patch separately
+    %(prog)s --split-patches --patch-range 1-5 series.mbox  # Review patches 1-5
+    %(prog)s --large-file=truncate patch.mbox   # Truncate to fit limit
+    %(prog)s --large-file=commits-only series.mbox  # Review commit messages only
+    %(prog)s --large-file=summary series.mbox   # High-level summary only
+    %(prog)s --large-file=chunk series.mbox     # Split and review in chunks
+
+Large File Modes:
+    error       - Fail with error (default)
+    truncate    - Truncate content to fit token limit
+    chunk       - Split into chunks and review each
+    commits-only - Extract and review only commit messages
+    summary     - Request high-level summary review
+
+LTS Releases:
+    Use -r/--release with LTS version (e.g., 24.11-lts, 23.11) to enable
+    stricter review rules: bug fixes only, no new features or APIs.
+    Any DPDK release with minor version .11 is an LTS release.
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Date and release options
+    parser.add_argument(
+        "-D",
+        "--date",
+        metavar="YYYY-MM-DD",
+        help="Review date context (default: today)",
+    )
+    parser.add_argument(
+        "-r",
+        "--release",
+        metavar="VERSION",
+        help="Target DPDK release (e.g., 24.11, 23.11-lts)",
+    )
+
+    # Large file handling options
+    large_group = parser.add_argument_group("Large File Handling")
+    large_group.add_argument(
+        "--large-file",
+        choices=LARGE_FILE_MODES,
+        default="error",
+        metavar="MODE",
+        help="How to handle large files: error (default), truncate, "
+        "chunk, commits-only, summary",
+    )
+    large_group.add_argument(
+        "--max-tokens",
+        type=int,
+        metavar="N",
+        help="Max input tokens (default: provider-specific)",
+    )
+    large_group.add_argument(
+        "--split-patches",
+        action="store_true",
+        help="Split mbox into individual patches and review each separately",
+    )
+    large_group.add_argument(
+        "--patch-range",
+        metavar="N-M",
+        help="Review only patches N through M (1-indexed, use with --split-patches)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine review date
+    review_date = args.date or date.today().isoformat()
+
+    # Build system prompt with date and release context
+    system_prompt = build_system_prompt(review_date, args.release)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    # Determine max tokens for this provider
+    max_input_tokens = args.max_tokens or PROVIDER_INPUT_LIMITS.get(
+        args.provider, 100000
+    )
+
+    # Estimate token count
+    estimated_tokens = estimate_tokens(patch_content + agents_content)
+
+    # Parse patch range if specified
+    patch_start, patch_end = None, None
+    if args.patch_range:
+        try:
+            if "-" in args.patch_range:
+                start, end = args.patch_range.split("-", 1)
+                patch_start = int(start)
+                patch_end = int(end)
+            else:
+                patch_start = patch_end = int(args.patch_range)
+        except ValueError:
+            error(f"Invalid --patch-range format: {args.patch_range}")
+
+    # Handle --split-patches mode
+    if args.split_patches:
+        patches = split_mbox_patches(patch_content)
+        total_patches = len(patches)
+
+        if total_patches == 1:
+            print(
+                "Note: Only 1 patch found in mbox, --split-patches has no effect",
+                file=sys.stderr,
+            )
+        else:
+            print(
+                f"Found {total_patches} patches in mbox",
+                file=sys.stderr,
+            )
+
+            # Apply patch range filter
+            if patch_start is not None:
+                if patch_start < 1 or patch_start > total_patches:
+                    error(
+                        f"Patch range start {patch_start} out of range (1-{total_patches})"
+                    )
+                if patch_end < patch_start or patch_end > total_patches:
+                    error(
+                        f"Patch range end {patch_end} out of range ({patch_start}-{total_patches})"
+                    )
+                patches = patches[patch_start - 1 : patch_end]
+                print(
+                    f"Reviewing patches {patch_start}-{patch_end} ({len(patches)} patches)",
+                    file=sys.stderr,
+                )
+
+            # Review each patch separately
+            all_reviews = []
+            for i, patch in enumerate(patches, patch_start or 1):
+                patch_label = f"Patch {i}/{total_patches}"
+                print(f"\nReviewing {patch_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    patch,
+                    f"{patch_name} ({patch_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((patch_label, review_text))
+
+            # Combine reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal API call
+            estimated_tokens = 0  # Bypass size check since we've already processed
+
+    # Check if content is too large
+    is_large = estimated_tokens > max_input_tokens
+
+    if is_large:
+        print(
+            f"Warning: Estimated {estimated_tokens:,} tokens exceeds limit of "
+            f"{max_input_tokens:,}",
+            file=sys.stderr,
+        )
+
+        if args.large_file == "error":
+            error(
+                f"Patch file too large ({estimated_tokens:,} tokens). "
+                f"Use --large-file=truncate|chunk|commits-only|summary to handle, "
+                f"or --split-patches to review patches individually."
+            )
+        elif args.large_file == "truncate":
+            print("Truncating content to fit token limit...", file=sys.stderr)
+            patch_content, was_truncated = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+            if was_truncated:
+                print("Content was truncated.", file=sys.stderr)
+        elif args.large_file == "commits-only":
+            print("Extracting commit messages only...", file=sys.stderr)
+            patch_content = extract_commit_messages(patch_content)
+            new_estimate = estimate_tokens(patch_content + agents_content)
+            print(
+                f"Reduced to ~{new_estimate:,} tokens (commit messages only)",
+                file=sys.stderr,
+            )
+            if new_estimate > max_input_tokens:
+                patch_content, _ = truncate_content(
+                    patch_content, max_input_tokens, args.provider
+                )
+        elif args.large_file == "summary":
+            print("Using summary mode for large patch...", file=sys.stderr)
+            system_prompt += get_summary_prompt()
+            patch_content, _ = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+        elif args.large_file == "chunk":
+            print("Processing in chunks...", file=sys.stderr)
+            all_reviews = []
+            for chunk, chunk_num, total_chunks in chunk_content(
+                patch_content, max_input_tokens, args.provider
+            ):
+                chunk_label = f"Chunk {chunk_num}/{total_chunks}"
+                print(f"Reviewing {chunk_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    chunk,
+                    f"{patch_name} ({chunk_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((chunk_label, review_text))
+
+            # Combine chunk reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal single API call below
+            estimated_tokens = 0
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Review date: {review_date}", file=sys.stderr)
+        if args.release:
+            lts_status = " (LTS)" if is_lts_release(args.release) else ""
+            print(f"Target release: {args.release}{lts_status}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        print(f"Estimated tokens: {estimated_tokens:,}", file=sys.stderr)
+        print(f"Max input tokens: {max_input_tokens:,}", file=sys.stderr)
+        if args.large_file != "error":
+            print(f"Large file mode: {args.large_file}", file=sys.stderr)
+        if args.split_patches:
+            print("Split patches: yes", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API (unless already processed via chunks/split)
+    if estimated_tokens > 0:  # Not already processed
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            args.output_format,
+            args.verbose,
+        )
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "review_date": review_date,
+                "target_release": args.release,
+                "is_lts": is_lts_release(args.release) if args.release else False,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"<br>Target release: {args.release}{lts_badge}"
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) on {review_date} -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model}) on {review_date}{release_info}</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"\n*Target release: {args.release}{lts_badge}*\n"
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model}) on {review_date}*
+{release_info}
+{review_text}
+"""
+    else:  # text
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n"
+        output_text += f"Review date: {review_date}\n"
+        output_text += release_info
+        output_text += "\n" + review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model}) on {review_date}
+{release_info}
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-03-10  1:57     ` Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py
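The auto-detection can be sketched in a few lines of shell; the function
body below is a trimmed excerpt of what the script does, not the full
implementation:

```shell
# Sketch of the provider auto-detection: a provider counts as available
# when its API-key environment variable is non-empty.
get_available_providers() {
    available=""
    [ -n "$ANTHROPIC_API_KEY" ] && available="${available}anthropic,"
    [ -n "$OPENAI_API_KEY" ] && available="${available}openai,"
    [ -n "$XAI_API_KEY" ] && available="${available}xai,"
    [ -n "$GOOGLE_API_KEY" ] && available="${available}google,"
    echo "${available%,}"    # strip the trailing comma
}
```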

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.txt when using the
output directory option.
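For example, the naming scheme works out as follows (the patch name here
is hypothetical):

```shell
# Output files follow <patch-stem>-<provider>.<ext>.
patch_file="my-patch.patch"
patch_stem="${patch_file%.*}"   # strip the extension -> "my-patch"
ext="txt"                       # the default "text" format uses .txt
for provider in anthropic openai; do
    echo "${patch_stem}-${provider}.${ext}"
done
```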

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e -o pipefail
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: \
+ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.51.0



* [PATCH v10 4/6] devtools: add multi-provider AI documentation review script
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
                       ` (2 preceding siblings ...)
  2026-03-10  1:57     ` [PATCH v10 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-03-10  1:57     ` Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models. Supports batch processing of multiple files.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

Output formats (-f/--format):
  - text: plain text with extractable diff/msg markers (default)
  - markdown: formatted review document
  - html: complete HTML document with styling
  - json: structured data with metadata

For each input file, the script produces:
  - <basename>.{txt,md,html,json}: review in selected format
  - <basename>.diff: unified diff (for text/json formats, or with -d flag)
  - <basename>.msg: commit message (for text/json formats, or with -d flag)
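The extractable markers in the text format can be recovered with a small
regex; this sketch reuses the same pattern the script applies (the review
text shown is a made-up example):

```python
import re

# Hypothetical text-format response containing the extraction markers.
review = """Review notes...
---COMMIT_MESSAGE_START---
doc: fix typo in mempool guide
---COMMIT_MESSAGE_END---
"""

# Same pattern the script uses to pull the commit message back out.
match = re.search(
    r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
    review,
    re.DOTALL,
)
commit_msg = match.group(1).strip() if match else ""
```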

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide).
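The path-to-prefix lookup is a first-match scan over an ordered list of
path prefixes; a minimal sketch with an abbreviated map (the script
carries the full table, ordered most-specific first):

```python
# Abbreviated version of the script's COMMIT_PREFIX_MAP lookup:
# entries are ordered most-specific first, so the first match wins.
PREFIX_MAP = [
    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
    ("doc/guides/nics/", "doc/guides/nics:"),
    ("doc/guides/", "doc:"),
]

def commit_prefix(path: str) -> str:
    for prefix_path, prefix in PREFIX_MAP:
        if path.startswith(prefix_path):
            return prefix
    return "doc:"  # fallback for anything outside the mapped trees
```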

Features:
  - Multiple file processing with glob support
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output-dir option
  - Output format selection via -f/--format option
  - Force diff/msg generation via -d/--diff option
  - Quiet mode (-q) suppresses stdout output
  - Verbose mode (-v) shows token usage and API details
  - Email integration using git sendemail configuration
  - Prompt caching support for Anthropic to reduce costs
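The Anthropic prompt caching works by marking the large, reusable
AGENTS.md system block as cacheable so repeated runs reuse the cached
prompt prefix; a trimmed sketch of the request shape (field values here
are illustrative):

```python
# Trimmed shape of the script's Anthropic request payload: the AGENTS.md
# text is marked ephemeral-cacheable to avoid resending it each run.
def build_cached_request(model, system_prompt, agents_text, user_text):
    return {
        "model": model,
        "max_tokens": 4096,
        "system": [
            {"type": "text", "text": system_prompt},
            {
                "type": "text",
                "text": agents_text,
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [{"role": "user", "content": user_text}],
    }
```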

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py doc/guides/nics/*.rst
  ./devtools/review-doc.py -f html -d -o /tmp doc/guides/nics/*.rst
  ./devtools/review-doc.py --send-email --to dev@dpdk.org file.rst

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 1099 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1099 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..c8a1988a10
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,1099 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Accepts multiple documentation files and generates output for each.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Map output format to file extension
+FORMAT_EXTENSIONS = {
+    "text": ".txt",
+    "markdown": ".md",
+    "html": ".html",
+    "json": ".json",
+}
+
+# Additional markers for extracting diff/msg (used with --diff flag)
+DIFF_MARKERS_INSTRUCTION = """
+
+ADDITIONALLY, at the end of your response, include these exact markers for automated extraction:
+---COMMIT_MESSAGE_START---
+(same commit message as above)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(same unified diff as above)
+---UNIFIED_DIFF_END---
+"""
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds via git commit --sign)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config() -> dict[str, Any]:
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    if config["port"]:
+        config["port"] = int(config["port"])
+
+    return config
+
+
+def get_commit_prefix(filepath: str) -> str:
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+    verbose: bool = False,
+) -> str:
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: " f"{usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: " f"{usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def parse_review_text(review_text: str) -> tuple[str, str]:
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def strip_diff_markers(text: str) -> str:
+    """Remove the diff/msg extraction markers from text."""
+    # Remove commit message markers and content
+    text = re.sub(
+        r"\n*---COMMIT_MESSAGE_START---\s*\n.*?\n---COMMIT_MESSAGE_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    # Remove unified diff markers and content
+    text = re.sub(
+        r"\n*---UNIFIED_DIFF_START---\s*\n.*?\n---UNIFIED_DIFF_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    return text.strip()
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+    verbose: bool = False,
+) -> bool:
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+        return True
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers. "
+        "Accepts multiple files and generates output for each.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s doc/guides/nics/*.rst              # Review all NIC docs
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst doc/guides/nics/i40e.rst
+    %(prog)s -f html -d -o /tmp/reviews doc/guides/nics/*.rst  # HTML + diff files
+    %(prog)s -f json -o /tmp doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+Output files (in output-dir):
+    <basename>.txt|.md|.html|.json  Review in selected format
+    <basename>.diff                  Unified diff (text/json, or with --diff)
+    <basename>.msg                   Commit message (text/json, or with --diff)
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+        """,
+    )
+
+    parser.add_argument(
+        "doc_files",
+        nargs="+",
+        metavar="doc_file",
+        help="Documentation file(s) to review",
+    )
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for all output files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress review output to stdout (only write files)",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-d",
+        "--diff",
+        action="store_true",
+        help="Always produce .diff and .msg files (automatic for text/json)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    # Validate all doc files exist before processing
+    doc_paths = []
+    for doc_file in args.doc_files:
+        doc_path = Path(doc_file)
+        if not doc_path.exists():
+            error(f"Documentation file not found: {doc_file}")
+        doc_paths.append((doc_file, doc_path))
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read AGENTS.md once
+    agents_content = agents_path.read_text()
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    provider_name = config["name"]
+
+    # Process each file
+    num_files = len(doc_paths)
+    for file_idx, (doc_file, doc_path) in enumerate(doc_paths, 1):
+        if num_files > 1:
+            print(
+                f"\n{'=' * 60}",
+                file=sys.stderr,
+            )
+            print(
+                f"Processing file {file_idx}/{num_files}: {doc_file}",
+                file=sys.stderr,
+            )
+            print(
+                f"{'=' * 60}",
+                file=sys.stderr,
+            )
+
+        # Determine output filenames
+        doc_basename = doc_path.stem
+        diff_file = output_dir / f"{doc_basename}.diff"
+        msg_file = output_dir / f"{doc_basename}.msg"
+
+        # Get commit prefix
+        commit_prefix = get_commit_prefix(doc_file)
+
+        # Read doc content
+        doc_content = doc_path.read_text()
+
+        if args.verbose:
+            print("=== Request ===", file=sys.stderr)
+            print(f"Provider: {args.provider}", file=sys.stderr)
+            print(f"Model: {model}", file=sys.stderr)
+            print(f"Output format: {args.output_format}", file=sys.stderr)
+            print(f"AGENTS file: {args.agents}", file=sys.stderr)
+            print(f"Doc file: {doc_file}", file=sys.stderr)
+            print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+            print(f"Output dir: {args.output_dir}", file=sys.stderr)
+            if args.send_email:
+                print("Send email: yes", file=sys.stderr)
+                print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+                if args.cc_addrs:
+                    print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+                print(f"From: {from_addr}", file=sys.stderr)
+            print("===============", file=sys.stderr)
+
+        # Call API
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            args.output_format,
+            args.diff,
+            args.verbose,
+        )
+
+        if not review_text:
+            print(
+                f"Warning: No response received for {doc_file}",
+                file=sys.stderr,
+            )
+            continue
+
+        # Determine review output file
+        format_ext = FORMAT_EXTENSIONS[args.output_format]
+        review_file = output_dir / f"{doc_basename}{format_ext}"
+
+        # Determine if we should write diff/msg files
+        write_diff_msg = args.diff or args.output_format in ("text", "json")
+
+        # Extract commit message and diff first (before stripping markers)
+        commit_msg, diff = "", ""
+        if write_diff_msg:
+            if args.output_format == "json":
+                # Will extract from JSON below
+                pass
+            else:
+                # Parse from text format markers
+                commit_msg, diff = parse_review_text(review_text)
+
+        # For non-text formats with --diff, strip the markers from display output
+        display_text = review_text
+        if args.diff and args.output_format in ("markdown", "html"):
+            display_text = strip_diff_markers(review_text)
+
+        # Build formatted output text
+        if args.output_format == "text":
+            output_text = review_text
+        elif args.output_format == "json":
+            # Try to parse JSON response
+            try:
+                review_data = json.loads(review_text)
+            except json.JSONDecodeError:
+                print("Warning: Response is not valid JSON", file=sys.stderr)
+                review_data = {"raw_response": review_text}
+
+            # Extract diff/msg from JSON if present
+            if write_diff_msg:
+                if isinstance(review_data, dict) and "raw_response" not in review_data:
+                    commit_msg = review_data.get("commit_message", "")
+                    diff = review_data.get("diff", "")
+
+            # Add metadata
+            output_data = {
+                "metadata": {
+                    "doc_file": doc_file,
+                    "provider": args.provider,
+                    "provider_name": provider_name,
+                    "model": model,
+                    "commit_prefix": commit_prefix,
+                },
+                "review": review_data,
+            }
+            output_text = json.dumps(output_data, indent=2)
+        elif args.output_format == "markdown":
+            output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{display_text}
+"""
+        elif args.output_format == "html":
+            output_text = f"""<!DOCTYPE html>
+<html>
+<head>
+<meta charset="utf-8">
+<title>Review: {doc_path.name}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; max-width: 900px; margin: 2em auto; padding: 0 1em; }}
+h1 {{ color: #333; }}
+.review-meta {{ color: #666; font-style: italic; }}
+pre {{ background: #f5f5f5; padding: 1em; overflow-x: auto; }}
+</style>
+</head>
+<body>
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+<div class="review-content">
+{display_text}
+</div>
+</body>
+</html>
+"""
+
+        # Write formatted review to file
+        review_file.write_text(output_text)
+        print(f"Review written to: {review_file}", file=sys.stderr)
+
+        # Write diff/msg files
+        if write_diff_msg:
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+            else:
+                msg_file.write_text("# No commit message generated\n")
+                print("Warning: Could not extract commit message", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+            else:
+                diff_file.write_text("# No changes suggested\n")
+                print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print to stdout unless quiet (or multiple files without verbose)
+        show_stdout = not args.quiet and (num_files == 1 or args.verbose)
+        if show_stdout:
+            print(
+                f"\n=== Documentation Review: {doc_path.name} "
+                f"(via {provider_name}) ==="
+            )
+            print(output_text)
+
+            # Print usage instructions for text format
+            if args.output_format == "text":
+                print("\n=== Output Files ===")
+                print(f"Commit message: {msg_file}")
+                print(f"Diff file:      {diff_file}")
+                print("\nTo apply changes:")
+                print(f"  git apply {diff_file}")
+                print(f"  git commit -sF {msg_file}")
+
+        # Send email if requested
+        if args.send_email:
+            if args.output_format != "text":
+                print(
+                    f"Note: Email will be sent as plain text regardless of "
+                    f"--format={args.output_format}",
+                    file=sys.stderr,
+                )
+
+            review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+            # Build email body
+            email_body = f"""AI-generated documentation review of {doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+            if args.verbose:
+                print("", file=sys.stderr)
+                print("=== Email Details ===", file=sys.stderr)
+                print(f"Subject: {review_subject}", file=sys.stderr)
+                print("=====================", file=sys.stderr)
+
+            send_email(
+                args.to_addrs,
+                args.cc_addrs,
+                from_addr,
+                review_subject,
+                None,
+                email_body,
+                args.dry_run,
+                args.verbose,
+            )
+
+            if not args.dry_run:
+                print("", file=sys.stderr)
+                print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+    # Print summary for multiple files
+    if num_files > 1:
+        print(f"\n{'=' * 60}", file=sys.stderr)
+        print(f"Processed {num_files} files", file=sys.stderr)
+        print(f"Output directory: {output_dir}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 5/6] doc: add AI-assisted patch review to contributing guide
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
                       ` (3 preceding siblings ...)
  2026-03-10  1:57     ` [PATCH v10 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-03-10  1:57     ` Stephen Hemminger
  2026-03-10  1:57     ` [PATCH v10 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a new section to the contributing guide describing the
analyze-patch.py script, which uses AI providers to review patches
against DPDK coding standards before submission to the mailing list.

The new section covers basic usage, provider selection, patch series
handling, LTS release review, and output format options. A note
clarifies that AI review supplements but does not replace human
review.

Also add a reference to the script in the new driver guide's
test tools checklist.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/contributing/new_driver.rst |  2 +
 doc/guides/contributing/patches.rst    | 59 ++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/guides/contributing/new_driver.rst b/doc/guides/contributing/new_driver.rst
index 555e875329..6c0d356cfd 100644
--- a/doc/guides/contributing/new_driver.rst
+++ b/doc/guides/contributing/new_driver.rst
@@ -210,3 +210,5 @@ Be sure to run the following test tools per patch in a patch series:
 * `check-doc-vs-code.sh`
 * `check-spdx-tag.sh`
 * Build documentation and validate how output looks
+* Optionally run ``analyze-patch.py`` for AI-assisted review
+  (see :ref:`ai_assisted_review` in the Contributing Guide)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 5f554d47e6..1e50799c19 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -183,6 +183,10 @@ Make your planned changes in the cloned ``dpdk`` repo. Here are some guidelines
 
 * Code and related documentation must be updated atomically in the same patch.
 
+* Consider running the :ref:`AI-assisted review <ai_assisted_review>` tool
+  before submitting to catch common issues early.
+  This is encouraged but not required.
+
 Once the changes have been made you should commit them to your local repo.
 
 For small changes, that do not require specific explanations, it is better to keep things together in the
@@ -503,6 +507,61 @@ Additionally, when contributing to the DTS tool, patches should also be checked
 the ``dts-check-format.sh`` script in the ``devtools`` directory of the DPDK repo.
 To run the script, extra :ref:`Python dependencies <dts_deps>` are needed.
 
+
+.. _ai_assisted_review:
+
+AI-Assisted Patch Review
+------------------------
+
+Contributors may optionally use the ``devtools/analyze-patch.py`` script
+to get an AI-assisted review of patches before submitting them to the mailing list.
+The script checks patches against the DPDK coding standards and contribution
+guidelines documented in ``AGENTS.md``.
+
+The script supports multiple AI providers (Anthropic Claude, OpenAI ChatGPT,
+xAI Grok, Google Gemini).  An API key for the chosen provider must be set
+in the corresponding environment variable (see ``--list-providers``).
+
+Basic usage::
+
+   # Review a single patch (default provider: Anthropic Claude)
+   devtools/analyze-patch.py my-patch.patch
+
+   # Use a different provider
+   devtools/analyze-patch.py -p openai my-patch.patch
+
+   # Review for an LTS branch (enables stricter rules)
+   devtools/analyze-patch.py -r 24.11 my-patch.patch
+
+   # List available providers and their API key variables
+   devtools/analyze-patch.py --list-providers
+
+For a patch series in an mbox file, the ``--split-patches`` option reviews
+each patch individually::
+
+   devtools/analyze-patch.py --split-patches series.mbox
+
+   # Review only a range of patches
+   devtools/analyze-patch.py --split-patches --patch-range 1-5 series.mbox
+
+When reviewing for a Long Term Stable (LTS) release, use the ``-r`` option
+with the target version.  Any DPDK release with minor version ``.11``
+(e.g., 23.11, 24.11) is automatically recognized as LTS,
+and the script will enforce stricter rules: bug fixes only, no new features or APIs.
+
+Output can be formatted as plain text (default), Markdown, HTML, or JSON::
+
+   devtools/analyze-patch.py -f markdown -o review.md my-patch.patch
+
+The review guidelines in ``AGENTS.md`` focus on correctness bug detection
+and other DPDK-specific requirements. Commit message formatting and
+SPDX/copyright compliance are checked by ``checkpatches.sh`` and are
+not duplicated in the AI review.
+
+.. note::
+
+   Always verify AI suggestions before acting on them.
+
 .. _contrib_check_compilation:
 
 Checking Compilation
-- 
2.51.0



* [PATCH v10 6/6] MAINTAINERS: add section for AI review tools
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
                       ` (4 preceding siblings ...)
  2026-03-10  1:57     ` [PATCH v10 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
@ 2026-03-10  1:57     ` Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-10  1:57 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add maintainer entries for the AI-assisted code review tooling:
AGENTS.md, analyze-patch.py, compare-reviews.sh, and
review-doc.py.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5eb8e9dc22..7c1a84274f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -109,6 +109,14 @@ F: license/
 F: .editorconfig
 F: .mailmap
 
+AI review tools
+M: Stephen Hemminger <stephen@networkplumber.org>
+M: Aaron Conole <aconole@redhat.com>
+F: AGENTS.md
+F: devtools/analyze-patch.py
+F: devtools/compare-reviews.sh
+F: devtools/review-doc.py
+
 Linux kernel uAPI headers
 M: Maxime Coquelin <maxime.coquelin@redhat.com>
 F: devtools/linux-uapi.sh
-- 
2.51.0



* [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (6 preceding siblings ...)
  2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
@ 2026-03-27 15:41   ` Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
                       ` (5 more replies)
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  9 siblings, 6 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add guidelines and tooling for AI-assisted code review of DPDK
patches.

AGENTS.md provides a two-tier review framework: correctness bugs
(resource leaks, use-after-free, race conditions) are reported at
>=50% confidence; style issues require >80% with false positive
suppression. Mechanical checks handled by checkpatches.sh are
excluded to avoid redundant findings.
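
The two-tier policy above can be sketched as a simple threshold check
(a minimal illustration only; the field names "category" and
"confidence" are hypothetical, not taken from the scripts in this
series):

```python
# Illustrative sketch of the two-tier confidence policy: correctness
# findings are reported at >=50% confidence, style/process findings
# only above 80%. Field names here are hypothetical.
CORRECTNESS_THRESHOLD = 0.50
STYLE_THRESHOLD = 0.80

def should_report(finding):
    """Return True if a finding clears its category's threshold."""
    if finding["category"] == "correctness":
        return finding["confidence"] >= CORRECTNESS_THRESHOLD
    # Style and process findings are held to the stricter bar.
    return finding["confidence"] > STYLE_THRESHOLD

findings = [
    {"category": "correctness", "confidence": 0.55,
     "msg": "possible leak on error path"},
    {"category": "style", "confidence": 0.60, "msg": "naming nit"},
    {"category": "style", "confidence": 0.90, "msg": "forbidden token"},
]
reported = [f["msg"] for f in findings if should_report(f)]
print(reported)  # ['possible leak on error path', 'forbidden token']
```

The asymmetry is deliberate: a missed correctness bug is costlier than
a spurious style comment, so the correctness bar is intentionally low.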

The analyze-patch.py script supports multiple AI providers
(Anthropic, OpenAI, xAI, Google) with mbox splitting, prompt
caching, and direct SMTP sending.

v11 - add more checks related to VLAN and MTU
      add checks for unsigned overflow on shifts

v10 - add more checks about MTU, buffer size and scatter,
      based on Ferruh's revision in 2024.

v9 - update AGENTS to reduce false positives
   - remove commit message/SPDX items from prompt (checkpatch's job).
   - update contributing guide text to match actual AGENTS.md coverage.

Stephen Hemminger (6):
  doc: add AGENTS.md for AI code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script
  doc: add AI-assisted patch review to contributing guide
  MAINTAINERS: add section for AI review tools

 AGENTS.md                              | 2162 ++++++++++++++++++++++++
 MAINTAINERS                            |    8 +
 devtools/analyze-patch.py              | 1348 +++++++++++++++
 devtools/compare-reviews.sh            |  192 +++
 devtools/review-doc.py                 | 1099 ++++++++++++
 doc/guides/contributing/new_driver.rst |    2 +
 doc/guides/contributing/patches.rst    |   59 +
 7 files changed, 4870 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100755 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.53.0



* [PATCH v11 1/6] doc: add AGENTS.md for AI code review tools
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-03-27 15:41     ` Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Provide structured guidelines for AI tools reviewing DPDK
patches, focusing on correctness bug detection (resource leaks,
use-after-free, race conditions), C coding style, forbidden
tokens, API conventions, and severity classifications.

Mechanical checks already handled by checkpatches.sh (SPDX
format, commit message formatting, tag ordering) are excluded
to avoid redundant and potentially contradictory findings.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 2162 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2162 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..d49ed859f1
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,2162 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+## CRITICAL INSTRUCTION - READ FIRST
+
+This document has two categories of review rules with different
+confidence thresholds:
+
+### 1. Correctness Bugs -- HIGHEST PRIORITY (report at >=50% confidence)
+
+**Always report potential correctness bugs.** These are the most
+valuable findings. When in doubt, report them with a note about
+your confidence level. A possible use-after-free or resource leak
+is worth mentioning even if you are not certain.
+
+Correctness bugs include:
+- Use-after-free (accessing memory after `free`/`rte_free`)
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference
+- Buffer overflows or out-of-bounds access
+- Uninitialized variable use in a reachable code path
+- Race conditions (unsynchronized shared state)
+- `volatile` used instead of atomic operations for inter-thread shared variables
+- `__atomic_load_n()`/`__atomic_store_n()`/`__atomic_*()` GCC built-ins instead of `rte_atomic_*_explicit()`
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` legacy barriers instead of `rte_atomic_thread_fence()`
+- Missing error checks on functions that can fail
+- Error paths that skip cleanup (goto labels, missing free/close)
+- Incorrect error propagation (wrong return value, lost errno)
+- Logic errors in conditionals (wrong operator, inverted test)
+- Integer overflow/truncation in size calculations
+- Missing bounds checks on user-supplied sizes or indices
+- `mmap()` return checked against `NULL` instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=`
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied data
+- `1 << n` on 64-bit bitmask (must use `1ULL << n` or `RTE_BIT64()`)
+- Left shift of narrow unsigned (`uint8_t`/`uint16_t`) used as 64-bit value (sign extension via implicit `int` promotion)
+- Variable assigned then overwritten before being read (dead store)
+- Same variable used as loop counter in nested loops
+- `memcpy`/`memcmp` with identical source and destination pointers (no-op or undefined)
+- `rte_mbuf_raw_free_bulk()` called on mbufs that may originate from different mempools (Tx burst, ring dequeue)
+- MTU confused with frame length (MTU is L3 payload; frame length = MTU + L2 overhead)
+- Using `dev_conf.rxmode.mtu` after configure instead of `dev->data->mtu`
+- Hardcoded Ethernet overhead instead of per-device calculation
+- MTU set without enabling `RTE_ETH_RX_OFFLOAD_SCATTER` when frame size exceeds mbuf data room
+- `mtu_set` callback rejects valid MTU when scatter Rx is already enabled
+- Rx queue setup silently drops oversized packets instead of enabling scatter or returning an error
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size check
+
+**Do NOT self-censor correctness bugs.** If you identify a code
+path where a resource could leak or memory could be used after
+free, report it. Do not talk yourself out of it.
+
+### 2. Style, Process, and Formatting -- suppress false positives
+
+**NEVER list a style/process item under "Errors" or "Warnings" if
+you conclude it is correct.**
+
+Before outputting any style, formatting, or process error/warning,
+verify it is actually wrong. If your analysis concludes with
+phrases like "there's no issue here", "which is fine", "appears
+correct", "is acceptable", or "this is actually correct" -- then
+DO NOT INCLUDE IT IN YOUR OUTPUT AT ALL. Delete it. Omit it
+entirely.
+
+This suppression rule applies to: naming conventions,
+code style, and process compliance. It does NOT apply to
+correctness bugs listed above. (SPDX/copyright format and
+commit message formatting are handled by checkpatch and are
+excluded from AI review entirely.)
+
+---
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux Kernel. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+
+**Correctness bugs are the primary goal of AI review.** Style and
+formatting checks are secondary. A review that catches a
+use-after-free but misses a style nit is far more valuable than
+one that catches every style issue but misses the bug.
+
+**BEFORE OUTPUTTING YOUR REVIEW**: Re-read each item.
+- For correctness bugs: keep them. If you have reasonable doubt
+  that a code path is safe, report it.
+- For style/process items: if ANY item contains phrases like "is
+  fine", "no issue", "appears correct", "is acceptable",
+  "actually correct" -- DELETE THAT ITEM. Do not include it.
+
+### Correctness review guidelines
+- Trace error paths: for every function that allocates a resource
+  or acquires a lock, verify that ALL error paths after that point
+  release it
+- Check every `goto error` and early `return`: does it clean up
+  everything allocated so far?
+- Look for use-after-free: after `free(p)`, is `p` accessed again?
+- Check that error codes are propagated, not silently dropped
+- Report at >=50% confidence; note uncertainty if appropriate
+- It is better to report a potential bug that turns out to be safe
+  than to miss a real bug
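+
+The error-path rules above can be sketched as follows (hypothetical
+names, not a real DPDK API):
+
+```c
+/* BAD - early return leaks the fd and leaves the lock held */
+c->fd = open(path, O_RDONLY);
+if (c->fd < 0)
+	return -errno;
+pthread_mutex_lock(&c->lock);
+c->buf = malloc(BUF_SIZE);
+if (c->buf == NULL)
+	return -ENOMEM;	/* fd leaked, lock never released */
+
+/* GOOD - every error path releases what was acquired before it */
+c->buf = malloc(BUF_SIZE);
+if (c->buf == NULL) {
+	ret = -ENOMEM;
+	goto unlock;
+}
+...
+unlock:
+	pthread_mutex_unlock(&c->lock);
+	if (ret != 0) {
+		close(c->fd);
+		c->fd = -1;	/* prevent double-close */
+	}
+	return ret;
+```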
+
+### Style and process review guidelines
+- Only comment on style/process issues when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+- Do NOT comment on copyright years, SPDX format, or copyright holders - not subject to AI review
+- Do NOT report an issue then contradict yourself - if something is acceptable, do not mention it at all
+- Do NOT include items in Errors/Warnings that you then say are "acceptable" or "correct"
+- Do NOT mention things that are correct or "not an issue" - only report actual problems
+- Do NOT speculate about contributor circumstances (employment, company policies, etc.)
+- Before adding any style item to your review, ask: "Is this actually wrong?" If no, omit it entirely.
+- NEVER write "(Correction: ...)" - if you need to correct yourself, simply omit the item entirely
+- Do NOT add vague suggestions like "should be verified" or "should be checked" - either it's wrong or don't mention it
+- Do NOT flag something as an Error then say "which is correct" in the same item
+- Do NOT say "no issue here" or "this is actually correct" - if there's no issue, do not include it in your review
+- Do NOT analyze cross-patch dependencies or compilation order - you cannot reliably determine this from patch review
+- Do NOT claim a patch "would cause compilation failure" based on symbols used in other patches in the series
+- Review each patch individually for its own correctness; assume the patch author ordered them correctly
+- When reviewing a patch series, OMIT patches that have no issues. Do not include a patch in your output just to say "no issues found" or to summarize what the patch does. Only include patches where you have actual findings to report.
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard-coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
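+
+As a sketch of the command-injection concern (hypothetical variable
+names):
+
+```c
+/* BAD - user-controlled interface name reaches the shell */
+snprintf(cmd, sizeof(cmd), "ip link show %s", user_ifname);
+system(cmd);	/* an ifname such as "eth0; rm -rf /" is executed */
+
+/* GOOD - no shell; the name is passed as a single argv entry */
+char *argv[] = { "ip", "link", "show", user_ifname, NULL };
+execv("/sbin/ip", argv);	/* in a forked child */
+```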
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- **`volatile` for inter-thread synchronization**: `volatile` does not
+  provide atomicity or memory ordering between threads. Use
+  `rte_atomic_load_explicit()`/`rte_atomic_store_explicit()` with
+  appropriate `rte_memory_order_*` instead. See the Shared Variable
+  Access section under Forbidden Tokens for details.
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- **Use-after-free** (any access to memory after it has been freed)
+- **Error path resource leaks**: For every allocation or fd open,
+  trace each error path (`goto`, early `return`, conditional) to
+  verify the resource is released. Common patterns to check:
+  - `malloc`/`rte_malloc` followed by a failure that does `return -1`
+    instead of `goto cleanup`
+  - `open()`/`socket()` fd not closed on a later error
+  - Lock acquired but not released on an error branch
+  - Partially initialized structure where early fields are allocated
+    but later allocation fails without freeing the early ones
+- **Double-free / double-close**: resource freed in both a normal
+  path and an error path, or fd closed but not set to -1 allowing
+  a second close
+- **Missing error checks**: functions that can fail (malloc, open,
+  ioctl, etc.) whose return value is not checked
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Usage of deprecated APIs when replacements exist
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already shows (remove them)
+- **Process-shared synchronization errors** (pthread mutexes in shared memory without `PTHREAD_PROCESS_SHARED`)
+- **`mmap()` checked against NULL instead of `MAP_FAILED`**: `mmap()` returns
+  `MAP_FAILED` (i.e., `(void *)-1`) on failure, NOT `NULL`. Checking
+  `== NULL` or `!= NULL` will miss the error and use an invalid pointer.
+  ```c
+  /* BAD - mmap never returns NULL on failure */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == NULL)       /* WRONG - will not catch MAP_FAILED */
+      return -1;
+
+  /* GOOD */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == MAP_FAILED)
+      return -1;
+  ```
+- **Statistics accumulation using `=` instead of `+=`**: When accumulating
+  statistics (counters, byte totals, packet counts), using `=` overwrites
+  the running total with only the latest value. This silently produces
+  wrong results.
+  ```c
+  /* BAD - overwrites instead of accumulating */
+  stats->rx_packets = nb_rx;
+  stats->rx_bytes = total_bytes;
+
+  /* GOOD - accumulates over time */
+  stats->rx_packets += nb_rx;
+  stats->rx_bytes += total_bytes;
+  ```
+  Note: `=` is correct for gauge-type values (e.g., queue depth, link
+  status) and for initial assignment. Only flag when the context is
+  clearly incremental accumulation (loop bodies, per-burst counters,
+  callback tallies).
+- **Integer multiply without widening cast**: When multiplying integers
+  to produce a result wider than the operands (sizes, offsets, byte
+  counts), the multiplication is performed at the operand width and
+  the upper bits are silently lost before the assignment. This applies
+  to any narrowing scenario: 16×16 assigned to a 32-bit variable,
+  32×32 assigned to a 64-bit variable, etc.
+  ```c
+  /* BAD - 32×32 overflows before widening to 64 */
+  uint64_t total_size = num_entries * entry_size;  /* both are uint32_t */
+  size_t offset = ring->idx * ring->desc_size;     /* 32×32 → truncated */
+
+  /* BAD - 16×16 overflows before widening to 32 */
+  uint32_t byte_count = pkt_len * nb_segs;         /* both are uint16_t */
+
+  /* GOOD - widen before multiply */
+  uint64_t total_size = (uint64_t)num_entries * entry_size;
+  size_t offset = (size_t)ring->idx * ring->desc_size;
+  uint32_t byte_count = (uint32_t)pkt_len * nb_segs;
+  ```
+- **Unbounded descriptor chain traversal**: When walking a chain of
+  descriptors (virtio, DMA, NIC Rx/Tx rings) where the chain length
+  or next-index comes from guest memory or an untrusted API caller,
+  the traversal MUST have a bounds check or loop counter to prevent
+  infinite loops or out-of-bounds access from malicious/corrupt data.
+  ```c
+  /* BAD - guest controls desc[idx].next with no bound */
+  while (desc[idx].flags & VRING_DESC_F_NEXT) {
+      idx = desc[idx].next;          /* guest-supplied, unbounded */
+      process(desc[idx]);
+  }
+
+  /* GOOD - cap iterations to descriptor ring size */
+  for (i = 0; i < ring_size; i++) {
+      if (!(desc[idx].flags & VRING_DESC_F_NEXT))
+          break;
+      idx = desc[idx].next;
+      if (idx >= ring_size)          /* bounds check */
+          return -EINVAL;
+      process(desc[idx]);
+  }
+  ```
+  This applies to any chain/linked-list traversal where indices or
+  pointers originate from untrusted input (guest VMs, user-space
+  callers, network packets).
+- **Bitmask shift using `1` instead of `1ULL` on 64-bit masks**: The
+  literal `1` is `int` (32 bits). Shifting it by 32 or more is
+  undefined behavior; shifting it by less than 32 but assigning to a
+  `uint64_t` silently zeroes the upper 32 bits. Use `1ULL << n`,
+  `UINT64_C(1) << n`, or the DPDK `RTE_BIT64(n)` macro.
+  ```c
+  /* BAD - 1 is int, UB if n >= 32, wrong if result used as uint64_t */
+  uint64_t mask = 1 << bit_pos;
+  if (features & (1 << VIRTIO_NET_F_MRG_RXBUF))  /* bit 15 OK, bit 32+ UB */
+
+  /* GOOD */
+  uint64_t mask = UINT64_C(1) << bit_pos;
+  uint64_t mask = 1ULL << bit_pos;
+  uint64_t mask = RTE_BIT64(bit_pos);        /* preferred in DPDK */
+  if (features & RTE_BIT64(VIRTIO_NET_F_MRG_RXBUF))
+  ```
+  Note: `1U << n` is acceptable when the mask is known to be 32-bit
+  (e.g., `uint32_t` register fields with `n < 32`). Only flag when
+  the result is stored in, compared against, or returned as a 64-bit
+  type, or when `n` could be >= 32.
+- **Left shift of narrow unsigned type sign-extends to 64-bit**: When
+  a `uint8_t` or `uint16_t` value is left-shifted, C integer promotion
+  converts it to `int` (signed 32-bit) before the shift. If the result
+  has bit 31 set, implicit conversion to `uint64_t`, `size_t`, or use
+  in pointer arithmetic sign-extends the upper 32 bits to all-1s,
+  producing a wrong address or value. This is Coverity SIGN_EXTENSION.
+  The fix is to cast the narrow operand to an unsigned type at least as
+  wide as the target before shifting.
+  ```c
+  /* BAD - uint16_t promotes to signed int, bit 31 may set,
+   * then sign-extends when converted to 64-bit for pointer math */
+  uint16_t idx = get_index();
+  void *addr = base + (idx << wqebb_shift);      /* SIGN_EXTENSION */
+  uint64_t off = (uint64_t)(idx << shift);        /* too late: shift already in int */
+
+  /* BAD - uint8_t shift with result used as size_t */
+  uint8_t page_order = get_order();
+  size_t size = page_order << PAGE_SHIFT;          /* promotes to int first */
+
+  /* GOOD - cast before shift */
+  void *addr = base + ((uint64_t)idx << wqebb_shift);
+  uint64_t off = (uint64_t)idx << shift;
+  size_t size = (size_t)page_order << PAGE_SHIFT;
+
+  /* GOOD - intermediate unsigned variable */
+  uint32_t offset = (uint32_t)idx << wqebb_shift;  /* OK if result fits 32 bits */
+  ```
+  Note: This is distinct from the `1 << n` pattern (where the literal
+  `1` is the problem) and from the integer-multiply pattern (where
+  the operation is `*` not `<<`). The mechanism is the same C integer
+  promotion rule, but the code patterns and Coverity checker names
+  differ. Only flag when the shift result is used in a context wider
+  than 32 bits (64-bit assignment, pointer arithmetic, function
+  argument expecting `uint64_t`/`size_t`). A shift whose result is
+  stored in a `uint32_t` or narrower variable is not affected.
+- **Variable overwrite before read (dead store)**: A variable is
+  assigned a value that is unconditionally overwritten before it is
+  ever read. This usually indicates a logic error (wrong variable
+  name, missing `if`, copy-paste mistake) or at minimum is dead code.
+  ```c
+  /* BAD - first assignment is never read */
+  ret = validate_input(cfg);
+  ret = apply_config(cfg);     /* overwrites without checking first ret */
+  if (ret != 0)
+      return ret;
+
+  /* GOOD - check each return value */
+  ret = validate_input(cfg);
+  if (ret != 0)
+      return ret;
+  ret = apply_config(cfg);
+  if (ret != 0)
+      return ret;
+  ```
+  Do NOT flag cases where the initial value is intentionally a default
+  that may or may not be overwritten (e.g., `int ret = 0;` followed
+  by a conditional assignment). Only flag unconditional overwrites
+  where the first value can never be observed.
+- **Shared loop counter in nested loops**: Using the same variable as
+  the loop counter in both an outer and inner loop causes the outer
+  loop to malfunction because the inner loop modifies its counter.
+  ```c
+  /* BAD - inner loop clobbers outer loop counter */
+  int i;
+  for (i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (i = 0; i < nb_descs; i++)    /* BUG: reuses i */
+          init_desc(i);
+  }
+
+  /* GOOD - distinct loop counters */
+  for (int i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (int j = 0; j < nb_descs; j++)
+          init_desc(j);
+  }
+  ```
+- **`memcpy`/`memcmp`/`memset` self-argument (same pointer as both
+  operands)**: Passing the same pointer as both source and destination
+  to `memcpy()` is undefined behavior per C99. Passing the same
+  pointer to both arguments of `memcmp()` is a no-op that always
+  returns 0, indicating a logic error (usually a copy-paste mistake
+  with the wrong variable name). The same applies to `rte_memcpy()`
+  and `memmove()` with identical arguments.
+  ```c
+  /* BAD - memcpy with same src and dst is undefined behavior */
+  memcpy(buf, buf, len);
+  rte_memcpy(dst, dst, len);
+
+  /* BAD - memcmp with same pointer always returns 0 (logic error) */
+  if (memcmp(key, key, KEY_LEN) == 0)  /* always true, wrong variable? */
+
+  /* BAD - likely copy-paste: should be comparing two different MACs */
+  if (memcmp(&eth->src_addr, &eth->src_addr, RTE_ETHER_ADDR_LEN) == 0)
+
+  /* GOOD - comparing two different things */
+  memcpy(dst, src, len);
+  if (memcmp(&eth->src_addr, &eth->dst_addr, RTE_ETHER_ADDR_LEN) == 0)
+  ```
+  This pattern almost always indicates a copy-paste bug where one of
+  the arguments should be a different variable.
+- **`rte_mbuf_raw_free_bulk()` on mixed-pool mbuf arrays**: Tx burst functions
+  and ring/queue dequeue paths receive mbufs that may originate from different
+  mempools (applications are free to send mbufs from any pool).
+  `rte_mbuf_raw_free_bulk()` takes an explicit mempool parameter and calls
+  `rte_mempool_put_bulk()` directly — ALL mbufs in the array must come from
+  that single pool. If mbufs come from different pools, they are returned to
+  the wrong pool, corrupting pool accounting and causing hard-to-debug failures.
+  Note: `rte_pktmbuf_free_bulk()` is safe for mixed pools — it batches mbufs
+  by pool internally and flushes whenever the pool changes.
+  ```c
+  /* BAD - assumes all mbufs are from the same pool */
+  /* (in tx_burst completion or ring dequeue error path) */
+  rte_mbuf_raw_free_bulk(mp, mbufs, nb_mbufs);
+
+  /* GOOD - rte_pktmbuf_free_bulk handles mixed pools correctly */
+  rte_pktmbuf_free_bulk(mbufs, nb_mbufs);
+
+  /* GOOD - free individually (each mbuf returned to its own pool) */
+  for (i = 0; i < nb_mbufs; i++)
+      rte_pktmbuf_free(mbufs[i]);
+  ```
+  This applies to any path that frees mbufs submitted by the application:
+  Tx completion, Tx error cleanup, and ring/queue drain paths.
+  `rte_mbuf_raw_free_bulk()` is an optimization for the fast-free case
+  (`RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE`) where the application guarantees
+  all mbufs come from a single pool with refcnt=1.
+- **MTU confused with Ethernet frame length**: Maximum Transmission Unit
+  (MTU) is the maximum L3 payload size (e.g., 1500 bytes for standard
+  Ethernet). The maximum Ethernet *frame length* includes L2 overhead:
+  Ethernet header (14 bytes) + optional VLAN tags (4 bytes each) + CRC
+  (4 bytes). The overhead varies per device depending on supported
+  encapsulations (VLAN, QinQ, etc.). Confusing MTU with frame length
+  produces off-by-14-to-22-byte errors in packet size limits, buffer
+  sizing, and scattered Rx decisions.
+
+  **VLAN tag accounting:** The outer VLAN tag is L2 overhead and does
+  NOT count toward MTU (matching Linux and FreeBSD). A 1522-byte
+  single-tagged frame is valid at MTU 1500. However, in QinQ the
+  inner (customer) tag DOES consume MTU — it is part of the customer
+  frame. So QinQ with MTU 1500 allows only 1496 bytes of L3 payload
+  unless the port MTU is raised to 1504.
+
+  **Using `rxmode.mtu` after configure:** After `rte_eth_dev_configure()`
+  completes, the canonical MTU is stored in `dev->data->mtu`. The
+  `dev->data->dev_conf.rxmode.mtu` field is the user's *request* and
+  must not be read after configure — it becomes stale if
+  `rte_eth_dev_set_mtu()` is called later. Both configure and set_mtu
+  write to `dev->data->mtu`; PMDs should always read from there.
+
+  **Overhead calculation:** Do not hardcode a single overhead constant.
+  Use the device's own overhead calculation (typically available via
+  `dev_info.max_rx_pktlen - dev_info.max_mtu` or an internal
+  `eth_overhead` field). Different devices support different
+  encapsulations, so the overhead is not a universal constant.
+
+  **Scattered Rx decision:** PMDs compare maximum frame length
+  (MTU + per-device overhead) against Rx buffer size to decide
+  whether scattered Rx is needed. Comparing raw MTU against buffer
+  size is wrong — it underestimates the actual frame size by the
+  overhead.
+  ```c
+  /* BAD - MTU used where frame length is needed */
+  if (dev->data->mtu > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* BAD - hardcoded overhead, wrong for QinQ-capable devices */
+  #define ETHER_OVERHEAD 18  /* may be 22 or 26 for VLAN/QinQ */
+  max_frame = mtu + ETHER_OVERHEAD;
+
+  /* BAD - reading rxmode.mtu after configure (stale if set_mtu called) */
+  static int
+  mydrv_rx_queue_setup(...) {
+      mtu = dev->data->dev_conf.rxmode.mtu;  /* WRONG - may be stale */
+      ...
+  }
+
+  /* GOOD - use dev->data->mtu, the canonical post-configure value */
+  static int
+  mydrv_rx_queue_setup(...) {
+      uint16_t mtu = dev->data->mtu;
+      ...
+  }
+
+  /* GOOD - use per-device overhead for frame length calculation */
+  uint32_t frame_overhead = dev_info.max_rx_pktlen - dev_info.max_mtu;
+  uint32_t max_frame_len = dev->data->mtu + frame_overhead;
+  if (max_frame_len > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* GOOD - device-specific overhead constant derived from capabilities */
+  static uint32_t
+  mydrv_eth_overhead(struct rte_eth_dev *dev) {
+      uint32_t overhead = RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_VLAN)
+          overhead += RTE_VLAN_HLEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_QINQ)
+          overhead += RTE_VLAN_HLEN;
+      return overhead;
+  }
+  ```
+  Note: In `rte_eth_dev_configure()` itself, reading `rxmode.mtu` is
+  correct — that is where the user's request is consumed and written
+  to `dev->data->mtu`. Only flag reads of `rxmode.mtu` *outside*
+  configure (queue setup, start, link update, MTU set, etc.).
+- **Missing scatter Rx for large MTU**: When the configured MTU
+  produces a frame size (MTU + Ethernet overhead) larger than the mbuf
+  data buffer size (`rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`),
+  the PMD MUST either enable scatter Rx (multi-segment receive) or reject
+  the configuration. Silently accepting the MTU and then truncating or
+  dropping oversized packets is a correctness bug.
+  ```c
+  /* BAD - accepts MTU but will truncate packets that don't fit */
+  static int
+  mydrv_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+  {
+      /* No check against mbuf size or scatter capability */
+      dev->data->mtu = mtu;
+      return 0;
+  }
+
+  /* BAD - rejects valid MTU even though scatter is enabled */
+  if (frame_size > mbuf_data_size)
+      return -EINVAL;  /* wrong: should allow if scatter is on */
+
+  /* GOOD - check scatter and mbuf size */
+  if (!dev->data->scattered_rx &&
+      frame_size > dev->data->min_rx_buf_size - RTE_PKTMBUF_HEADROOM)
+      return -EINVAL;
+
+  /* GOOD - auto-enable scatter when needed */
+  if (frame_size > mbuf_data_size) {
+      if (!(dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_SCATTER))
+          return -EINVAL;
+      dev->data->dev_conf.rxmode.offloads |=
+          RTE_ETH_RX_OFFLOAD_SCATTER;
+      dev->data->scattered_rx = 1;
+  }
+  ```
+  Key relationships:
+  - `dev_info.max_rx_pktlen`: maximum frame the hardware can receive
+  - `dev_info.max_mtu`: maximum MTU = `max_rx_pktlen` - overhead
+  - `dev_info.min_rx_bufsize`: minimum Rx buffer the HW requires
+  - `dev_info.max_rx_bufsize`: maximum single-descriptor buffer size
+  - `mbuf data size = rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`
+  - When scatter is off: frame length must fit in a single mbuf
+  - When scatter is on: frame length can span multiple mbufs;
+    the PMD selects a scattered Rx function
+
+  This pattern should be checked in three places:
+  1. `dev_configure()` -- validate MTU against mbuf size / scatter
+  2. `rx_queue_setup()` -- select scattered vs non-scattered Rx path
+  3. `mtu_set()` -- runtime MTU change must re-validate
+- **Rx queue function selection ignoring scatter**: When a PMD has
+  separate fast-path Rx functions for scalar (single-segment) and
+  scattered (multi-segment) modes, it must select the scattered
+  variant whenever `dev->data->scattered_rx` is set OR when the
+  configured frame length exceeds the single mbuf data size.
+  Failing to do so causes the scalar Rx function to silently drop
+  or corrupt multi-segment packets.
+  ```c
+  /* BAD - only checks offload flag, ignores actual need */
+  if (rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER)
+      rx_func = mydrv_recv_scattered;
+  else
+      rx_func = mydrv_recv_single;  /* will drop oversized pkts */
+
+  /* GOOD - check both the flag and the size */
+  mbuf_size = rte_pktmbuf_data_room_size(rxq->mp) -
+              RTE_PKTMBUF_HEADROOM;
+  max_pkt = dev->data->mtu + overhead;
+  if ((rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER) ||
+      max_pkt > mbuf_size) {
+      dev->data->scattered_rx = 1;
+      rx_func = mydrv_recv_scattered;
+  } else {
+      rx_func = mydrv_recv_single;
+  }
+  ```
+
+### Architecture & Patterns
+- Code that violates existing patterns in the codebase
+- Missing error handling
+- Code that is not async-signal-safe (e.g. calls `malloc()` or
+  `printf()` from a signal handler)
+- **Environment variables used for driver configuration instead of devargs**:
+  Drivers must use DPDK device arguments (`devargs`) for runtime
+  configuration, not environment variables. Devargs are preferred because
+  they are obviously device-specific rather than having global impact,
+  some launch methods strip all environment variables, and devargs can
+  be associated on a per-device basis rather than per-device-type.
+  Use `rte_kvargs_parse()` on the devargs string instead.
+  ```c
+  /* BAD - environment variable for driver tuning */
+  val = getenv("MYDRV_RX_BURST_SIZE");
+  if (val != NULL)
+      burst = atoi(val);
+
+  /* GOOD - devargs parsed at probe time */
+  static const char * const valid_args[] = { "rx_burst_size", NULL };
+  kvlist = rte_kvargs_parse(devargs->args, valid_args);
+  rte_kvargs_process(kvlist, "rx_burst_size", &parse_uint, &burst);
+  ```
+  Note: `getenv()` in EAL itself or in test/example code is acceptable.
+  This rule applies to libraries under `lib/` and drivers under `drivers/`.
+
+### New Library API Design
+
+When a patch adds a new library under `lib/`, review API design in
+addition to correctness and style.
+
+**API boundary.** A library should be a compiler, not a framework.
+The model is `rte_acl`: create a context, feed input, get structured
+output, caller decides what to do with it. No callbacks needed. If
+the library requires callers to implement a callback table to
+function, the boundary is wrong — the library is asking the caller
+to be its backend.
+
+**Callback structs** (Warning / Error). Any function-pointer struct
+in an installed header is an ABI break waiting to happen. Adding or
+reordering a member breaks all consumers.
+- Prefer a single callback parameter over an ops table.
+- \>5 callbacks: **Warning** — likely needs redesign.
+- \>20 callbacks: **Error** — this is an app plugin API, not a library.
+- All callbacks must have Doxygen (contract, return values, ownership).
+- Void-returning callbacks for failable operations swallow errors —
+  flag as **Error**.
+- Callbacks serving app-specific needs (e.g. `verbose_level_get`)
+  indicate wrong code was extracted into the library.
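+
+A minimal sketch of the contrast (hypothetical `rte_foo` names):
+
+```c
+/* Warning-prone - ops table in an installed header; adding or
+ * reordering a member breaks the ABI for every consumer */
+struct rte_foo_ops {
+	int (*init)(void *ctx);
+	void (*log)(const char *msg);	/* void return swallows errors */
+};
+
+/* Preferred - a single documented callback parameter */
+typedef int (*rte_foo_cb_t)(const struct rte_foo_result *res,
+		void *user_data);
+int rte_foo_process(struct rte_foo_ctx *ctx,
+		rte_foo_cb_t cb, void *user_data);
+```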
+
+**Extensible structures.** Prefer TLV / tagged-array patterns over
+enum + union, following `rte_flow_item` and `rte_flow_action` as
+the model. Type tag + pointer to type-specific data allows adding
+types without ABI breaks. Flag as **Warning**:
+- Large enums (100+) consumers must switch on.
+- Unions that grow with every new feature.
+- Ask: "What changes when a feature is added next release?" If
+  "add an enum value and union arm" — should be TLV.
+
+**Installed headers.** If it's in `headers` or `indirect_headers`
+in meson.build, it's public API. Don't call it "private." If truly
+internal, don't install it.
+
+**Global state.** Prefer handle-based APIs (`create`/`destroy`)
+over singletons. `rte_acl` allows multiple independent classifier
+instances; new libraries should do the same.
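+
+A minimal sketch of the two API shapes (hypothetical `rte_foo`
+names):
+
+```c
+/* Avoid - singleton with one hidden global instance per process */
+int rte_foo_init(const struct rte_foo_config *cfg);
+int rte_foo_lookup(const void *key);
+
+/* Prefer - handle-based, independent instances as rte_acl allows */
+struct rte_foo *rte_foo_create(const struct rte_foo_config *cfg);
+int rte_foo_lookup(struct rte_foo *foo, const void *key);
+void rte_foo_destroy(struct rte_foo *foo);
+```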
+
+**Output ownership.** Prefer caller-allocated buffers, or
+library-allocated buffers that the caller frees, over internal
+static buffers. If static buffers are used, document their lifetime
+and ensure Doxygen examples do not show stale-pointer usage.
+
+---
+
+## C Coding Style
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a newline
+- Code style should be consistent within each file
+
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Reserve the `RTE_`/`rte_` prefix for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` -> Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` -> Use: denylist/allowlist, blocklist/passlist
+- `cripple` -> Use: impacted, degraded, restricted, immobilized
+- `tribe` -> Use: team, squad
+- `sanity check` -> Use: coherence check, test, verification
+
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (likely(p != NULL))   /* Good - likely/unlikely don't change this */
+if (unlikely(p == NULL)) /* Good - likely/unlikely don't change this */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (errno != 0)     /* Good - this IS explicit */
+if (likely(a != 0)) /* Good - likely/unlikely don't change this */
+if (!a)             /* Bad - don't use ! on integers */
+if (a)              /* Bad - implicit, should be a != 0 */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+**Explicit comparison** means using `==` or `!=` operators (e.g., `x != 0`, `p == NULL`).
+**Implicit comparison** means relying on truthiness without an operator (e.g., `if (x)`, `if (!p)`).
+**Note**: `likely()` and `unlikely()` macros do NOT affect whether a comparison is explicit or implicit.
+
+### Boolean Usage
+
+Prefer `bool` (from `<stdbool.h>`) over `int` for variables,
+parameters, and return values that are purely true/false. Using
+`bool` makes intent explicit, enables compiler diagnostics for
+misuse, and is self-documenting.
+
+```c
+/* Bad - int used as boolean flag */
+int verbose = 0;
+int is_enabled = 1;
+
+int
+check_valid(struct item *item)
+{
+	if (item->flags & ITEM_VALID)
+		return 1;
+	return 0;
+}
+
+/* Good - bool communicates intent */
+bool verbose = false;
+bool is_enabled = true;
+
+bool
+check_valid(struct item *item)
+{
+	return item->flags & ITEM_VALID;
+}
+```
+
+**Guidelines:**
+- Use `bool` for variables that only hold true/false values
+- Use `bool` return type for predicate functions (functions that
+  answer a yes/no question, often named `is_*`, `has_*`, `can_*`)
+- Use `true`/`false` rather than `1`/`0` for boolean assignments
+- Boolean variables and parameters should not use explicit
+  comparison: `if (verbose)` is correct, not `if (verbose == true)`
+- `int` is still appropriate when a value can be negative, is an
+  error code, or carries more than two states
+
+**Structure fields:**
+- `bool` occupies 1 byte. In packed or cache-critical structures,
+  consider using a bitfield or flags word instead
+- For configuration structures and non-hot-path data, `bool` is
+  preferred over `int` for flag fields
+
+```c
+/* Bad - int flags waste space and obscure intent */
+struct port_config {
+	int promiscuous;     /* 0 or 1 */
+	int link_up;         /* 0 or 1 */
+	int autoneg;         /* 0 or 1 */
+	uint16_t mtu;
+};
+
+/* Good - bool for flag fields */
+struct port_config {
+	bool promiscuous;
+	bool link_up;
+	bool autoneg;
+	uint16_t mtu;
+};
+
+/* Also good - bitfield for cache-critical structures */
+struct fast_path_config {
+	uint32_t flags;      /* bitmask of CONFIG_F_* */
+	/* ... hot-path fields ... */
+};
+```
+
+**Do NOT flag:**
+- `int` return type for functions that return error codes (0 for
+  success, negative for error) — these are NOT boolean
+- `int` used for tri-state or multi-state values
+- `int` flags in existing code where changing the type would be a
+  large, unrelated refactor
+- Bitfield or flags-word approaches in performance-critical
+  structures
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Flexible Array Members
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
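+
+The flexible member lets the header and payload share a single
+allocation. A minimal standalone sketch of the usual allocation
+pattern (plain C, no DPDK dependencies, names are illustrative):
+
+```c
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+
+struct msg {
+	int len;
+	char data[];	/* C99 flexible array member */
+};
+
+int main(void)
+{
+	const char payload[] = "hello";
+	/* one malloc covers the header plus the trailing data */
+	struct msg *m = malloc(sizeof(*m) + sizeof(payload));
+
+	assert(m != NULL);
+	m->len = sizeof(payload);
+	memcpy(m->data, payload, m->len);
+	assert(m->len == 6 && m->data[0] == 'h');
+	free(m);
+	return 0;
+}
+```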
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free() (CWE-14)
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store (CWE-14: Compiler Removal of Code to Clear Buffers). For security-sensitive data, use `explicit_bzero()`, `rte_memset_sensitive()`, or `rte_free_sensitive()` which the compiler is not permitted to eliminate.
+
+```c
+/* Bad - compiler may eliminate memset */
+memset(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - explicit_bzero cannot be optimized away */
+explicit_bzero(secret_key, sizeof(secret_key));
+free(secret_key);
+
+/* Good - DPDK wrapper for clearing sensitive data */
+rte_memset_sensitive(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for rte_malloc'd sensitive data, combined clear+free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()` which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-accessible descriptor memory */
+struct hw_desc *ring = rte_malloc(NULL, n * sizeof(*ring), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+### Non-const Function Pointer Arrays
+
+Arrays of function pointers (ops tables, dispatch tables, callback arrays)
+should be declared `const` when their contents are fixed at compile time.
+A non-`const` function pointer array can be overwritten by bugs or exploits,
+and it prevents the compiler from placing the table in read-only memory.
+
+```c
+/* Bad - mutable when it doesn't need to be */
+static rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+
+/* Good - immutable dispatch table */
+static const rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+```
+
+**Exceptions** (do NOT flag):
+- Arrays modified at runtime for CPU feature detection or capability probing
+  (e.g., selecting a burst function based on `rte_cpu_get_flag_enabled()`)
+- Arrays containing mutable state (e.g., entries that are linked into lists)
+- Arrays populated dynamically via registration APIs
+- `dev_ops` or similar structures assigned per-device at init time
+
+Only flag when the array is fully initialized at declaration with constant
+values and never modified thereafter.
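+
+A standalone sketch of the pattern (plain C, illustrative names): the
+table contents are fixed at compile time, so `const` lets the compiler
+place it in read-only memory while the index is still chosen at runtime.
+
+```c
+#include <assert.h>
+
+static int rx_scalar(int n) { return n; }
+static int rx_vector(int n) { return n * 2; }
+
+/* immutable dispatch table - eligible for .rodata */
+static int (* const rx_funcs[])(int) = {
+	rx_scalar,
+	rx_vector,
+};
+
+int main(void)
+{
+	int idx = 1;	/* e.g. selected by CPU feature detection */
+
+	assert(rx_funcs[idx](8) == 16);
+	assert(rx_funcs[0](8) == 8);
+	return 0;
+}
+```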
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `getenv()` | `rte_kvargs_parse()` / devargs | drivers/ (allowed in EAL, examples/, app/test/) |
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+
+#### Shared Variable Access: volatile vs Atomics
+
+Variables shared between threads or between a thread and a signal
+handler **must** use atomic operations. The C `volatile` keyword is
+NOT a substitute for atomics — it prevents compiler optimization
+of accesses but provides no atomicity guarantees and no memory
+ordering between threads. On some architectures, `volatile` reads
+and writes may tear on unaligned or multi-word values.
+
+DPDK provides C11 atomic wrappers that are portable across all
+supported compilers and architectures. Always use these for shared
+state.
+
+**Reading shared variables:**
+
+```c
+/* BAD - volatile provides no atomicity or ordering guarantee */
+volatile int stop_flag;
+if (stop_flag)           /* data race, compiler/CPU can reorder */
+    return;
+
+/* BAD - direct access to shared variable without atomic */
+if (shared->running)     /* undefined behavior if another thread writes */
+    process();
+
+/* GOOD - DPDK C11 atomic wrapper */
+if (rte_atomic_load_explicit(&shared->stop_flag, rte_memory_order_acquire))
+    return;
+
+/* GOOD - relaxed is fine for statistics or polling a flag where
+ * you don't need to synchronize other memory accesses */
+count = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+```
+
+**Writing shared variables:**
+
+```c
+/* BAD - volatile write */
+volatile int *flag = &shared->ready;
+*flag = 1;
+
+/* GOOD - atomic store with appropriate ordering */
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+```
+
+**Read-modify-write operations:**
+
+```c
+/* BAD - not atomic even with volatile */
+volatile uint64_t *counter = &stats->packets;
+*counter += nb_rx;       /* data race: load, add, store are 3 separate operations */
+
+/* GOOD - atomic add */
+rte_atomic_fetch_add_explicit(&stats->packets, nb_rx,
+    rte_memory_order_relaxed);
+```
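+
+The `rte_atomic_*` wrappers mirror C11 `<stdatomic.h>` one-to-one, so
+the semantics can be sketched without DPDK (single-threaded here, for
+illustration only; each comment shows the equivalent DPDK call):
+
+```c
+#include <assert.h>
+#include <stdatomic.h>
+
+int main(void)
+{
+	atomic_uint flag;
+	atomic_ulong packets;
+
+	atomic_init(&flag, 0);
+	atomic_init(&packets, 0);
+
+	/* ~ rte_atomic_store_explicit(&flag, 1, rte_memory_order_release) */
+	atomic_store_explicit(&flag, 1, memory_order_release);
+
+	/* ~ rte_atomic_load_explicit(&flag, rte_memory_order_acquire) */
+	assert(atomic_load_explicit(&flag, memory_order_acquire) == 1);
+
+	/* ~ rte_atomic_fetch_add_explicit(&packets, 32, rte_memory_order_relaxed) */
+	atomic_fetch_add_explicit(&packets, 32, memory_order_relaxed);
+	assert(atomic_load_explicit(&packets, memory_order_relaxed) == 32);
+	return 0;
+}
+```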
+
+#### Forbidden Atomic APIs in New Code
+
+New code **must not** use GCC/Clang `__atomic_*` built-ins or the
+legacy DPDK `rte_smp_*mb()` barriers. These are deprecated and
+will be removed. Use the DPDK C11 atomic wrappers instead.
+
+**GCC/Clang `__atomic_*` built-ins — do not use:**
+
+```c
+/* BAD - GCC built-in, not portable, not DPDK API */
+val = __atomic_load_n(&shared->count, __ATOMIC_RELAXED);
+__atomic_store_n(&shared->flag, 1, __ATOMIC_RELEASE);
+__atomic_fetch_add(&shared->counter, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&shared->state, &expected, desired,
+    0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+
+/* GOOD - DPDK C11 atomic wrappers */
+val = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+rte_atomic_fetch_add_explicit(&shared->counter, 1, rte_memory_order_relaxed);
+rte_atomic_compare_exchange_strong_explicit(&shared->state, &expected, desired,
+    rte_memory_order_acq_rel, rte_memory_order_acquire);
+rte_atomic_thread_fence(rte_memory_order_seq_cst);
+```
+
+Similarly, do not use `__sync_*` built-ins (`__sync_fetch_and_add`,
+`__sync_bool_compare_and_swap`, etc.) — these are the older GCC
+atomics with implicit full barriers and are even less appropriate
+than `__atomic_*`.
+
+**Legacy DPDK barriers — do not use:**
+
+```c
+/* BAD - legacy DPDK barriers, deprecated */
+rte_smp_mb();            /* full memory barrier */
+rte_smp_rmb();           /* read memory barrier */
+rte_smp_wmb();           /* write memory barrier */
+
+/* GOOD - C11 fence with explicit ordering */
+rte_atomic_thread_fence(rte_memory_order_seq_cst);   /* replaces rte_smp_mb() */
+rte_atomic_thread_fence(rte_memory_order_acquire);    /* replaces rte_smp_rmb() */
+rte_atomic_thread_fence(rte_memory_order_release);    /* replaces rte_smp_wmb() */
+
+/* BETTER - use ordering on the atomic operation itself when possible */
+val = rte_atomic_load_explicit(&shared->flag, rte_memory_order_acquire);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+The legacy `rte_atomic16/32/64_*()` type-specific functions (e.g.,
+`rte_atomic32_inc()`, `rte_atomic64_read()`) are also deprecated.
+Use `rte_atomic_fetch_add_explicit()`, `rte_atomic_load_explicit()`,
+etc. with standard C integer types.
+
+| Deprecated API | Replacement |
+|----------------|-------------|
+| `__atomic_load_n()` | `rte_atomic_load_explicit()` |
+| `__atomic_store_n()` | `rte_atomic_store_explicit()` |
+| `__atomic_fetch_add()` | `rte_atomic_fetch_add_explicit()` |
+| `__atomic_compare_exchange_n()` | `rte_atomic_compare_exchange_strong_explicit()` |
+| `__atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+| `__ATOMIC_RELAXED` | `rte_memory_order_relaxed` |
+| `__ATOMIC_ACQUIRE` | `rte_memory_order_acquire` |
+| `__ATOMIC_RELEASE` | `rte_memory_order_release` |
+| `__ATOMIC_ACQ_REL` | `rte_memory_order_acq_rel` |
+| `__ATOMIC_SEQ_CST` | `rte_memory_order_seq_cst` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence(rte_memory_order_seq_cst)` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence(rte_memory_order_acquire)` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence(rte_memory_order_release)` |
+| `rte_atomic32_inc(&v)` | `rte_atomic_fetch_add_explicit(&v, 1, rte_memory_order_relaxed)` |
+| `rte_atomic64_read(&v)` | `rte_atomic_load_explicit(&v, rte_memory_order_relaxed)` |
+
+#### Memory Ordering Guide
+
+Use the weakest ordering that is correct. Stronger ordering
+constrains hardware and compiler optimization unnecessarily.
+
+| DPDK Ordering | When to Use |
+|---------------|-------------|
+| `rte_memory_order_relaxed` | Statistics counters, polling flags where no other data depends on the value. Most common for simple counters. |
+| `rte_memory_order_acquire` | **Load** side of a flag/pointer that guards access to other shared data. Ensures subsequent reads see data published by the releasing thread. |
+| `rte_memory_order_release` | **Store** side of a flag/pointer that publishes shared data. Ensures all prior writes are visible to a thread that does an acquire load. |
+| `rte_memory_order_acq_rel` | Read-modify-write operations (e.g., `fetch_add`) that both consume and publish shared state in one operation. |
+| `rte_memory_order_seq_cst` | Rarely needed. Only when multiple independent atomic variables must be observed in a globally consistent total order. Avoid unless required. |
+
+**Common pattern — producer/consumer flag:**
+
+```c
+/* Producer thread: fill buffer, then signal ready */
+fill_buffer(buf, data, len);
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+
+/* Consumer thread: wait for flag, then read buffer */
+while (!rte_atomic_load_explicit(&shared->ready, rte_memory_order_acquire))
+    rte_pause();
+process_buffer(buf, len);  /* guaranteed to see producer's writes */
+```
+
+**Common pattern — statistics counter (no ordering needed):**
+
+```c
+rte_atomic_fetch_add_explicit(&port_stats->rx_packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Standalone Fences
+
+Prefer ordering on the atomic operation itself (acquire load,
+release store) over standalone fences. Standalone fences
+(`rte_atomic_thread_fence()`) are a blunt instrument that
+orders ALL memory accesses around the fence, not just the
+atomic variable you care about.
+
+```c
+/* Acceptable but less precise - standalone fence */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_relaxed);
+rte_atomic_thread_fence(rte_memory_order_release);
+
+/* Preferred - ordering on the operation itself */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+Standalone fences are appropriate when synchronizing multiple
+non-atomic writes (e.g., filling a structure before publishing
+a pointer to it) where annotating each write individually is
+impractical.
+
+#### When volatile Is Still Acceptable
+
+`volatile` remains correct for:
+- Memory-mapped I/O registers (hardware MMIO)
+- Variables shared with signal handlers in single-threaded contexts
+- Interaction with `setjmp`/`longjmp`
+
+`volatile` is NOT correct for:
+- Any variable accessed by multiple threads
+- Polling flags between lcores
+- Statistics counters updated from multiple threads
+- Flags set by one thread and read by another
+
+**Do NOT flag** `volatile` used for MMIO or hardware register access
+(common in drivers under `drivers/*/base/`).
+
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
+
+### Process-Shared Synchronization
+
+When placing synchronization primitives in shared memory (memory accessible by multiple processes, such as DPDK primary/secondary processes or `mmap`'d regions), they **must** be initialized with process-shared attributes. Failure to do so causes **undefined behavior** that may appear to work in testing but fail unpredictably in production.
+
+#### pthread Mutexes in Shared Memory
+
+**This is an error** - mutex in shared memory without `PTHREAD_PROCESS_SHARED`:
+
+```c
+/* BAD - undefined behavior when used across processes */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+void init_shared(struct shared_data *shm) {
+	pthread_mutex_init(&shm->lock, NULL);  /* ERROR: missing pshared attribute */
+}
+```
+
+**Correct implementation**:
+
+```c
+/* GOOD - properly initialized for cross-process use */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int init_shared(struct shared_data *shm) {
+	pthread_mutexattr_t attr;
+	int ret;
+
+	ret = pthread_mutexattr_init(&attr);
+	if (ret != 0)
+		return -ret;
+
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret != 0) {
+		pthread_mutexattr_destroy(&attr);
+		return -ret;
+	}
+
+	ret = pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	return -ret;
+}
+```
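+
+The requirement can be demonstrated with a self-contained sketch (plain
+POSIX, no DPDK): a `PTHREAD_PROCESS_SHARED` mutex in an anonymous
+`MAP_SHARED` mapping correctly serializes a counter across `fork()`.
+
+```c
+#define _DEFAULT_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+int main(void)
+{
+	struct { pthread_mutex_t lock; int counter; } *shm;
+	pthread_mutexattr_t attr;
+	pid_t pid;
+
+	/* anonymous shared mapping - visible to both parent and child */
+	shm = mmap(NULL, sizeof(*shm), PROT_READ | PROT_WRITE,
+		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	assert(shm != MAP_FAILED);
+
+	pthread_mutexattr_init(&attr);
+	pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+	shm->counter = 0;
+
+	pid = fork();
+	for (int i = 0; i < 1000; i++) {
+		pthread_mutex_lock(&shm->lock);
+		shm->counter++;
+		pthread_mutex_unlock(&shm->lock);
+	}
+	if (pid == 0)
+		_exit(0);
+	wait(NULL);
+	assert(shm->counter == 2000);
+	return 0;
+}
+```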
+
+#### pthread Condition Variables in Shared Memory
+
+Condition variables also require the process-shared attribute:
+
+```c
+/* BAD - will not work correctly across processes */
+pthread_cond_init(&shm->cond, NULL);
+
+/* GOOD */
+pthread_condattr_t cattr;
+pthread_condattr_init(&cattr);
+pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+pthread_cond_init(&shm->cond, &cattr);
+pthread_condattr_destroy(&cattr);
+```
+
+#### pthread Read-Write Locks in Shared Memory
+
+```c
+/* BAD */
+pthread_rwlock_init(&shm->rwlock, NULL);
+
+/* GOOD */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init(&rwattr);
+pthread_rwlockattr_setpshared(&rwattr, PTHREAD_PROCESS_SHARED);
+pthread_rwlock_init(&shm->rwlock, &rwattr);
+pthread_rwlockattr_destroy(&rwattr);
+```
+
+#### When to Flag This Issue
+
+Flag as an **Error** when ALL of the following are true:
+1. A `pthread_mutex_t`, `pthread_cond_t`, `pthread_rwlock_t`, or `pthread_barrier_t` is initialized
+2. The primitive is stored in shared memory (identified by context such as: structure in `rte_malloc`/`rte_memzone`, `mmap`'d memory, memory passed to secondary processes, or structures documented as shared)
+3. The initialization uses `NULL` attributes or attributes without `PTHREAD_PROCESS_SHARED`
+
+**Do NOT flag** when:
+- The mutex is in thread-local or process-private heap memory (`malloc`)
+- The mutex is a local/static variable not in shared memory
+- The code already uses `pthread_mutexattr_setpshared()` with `PTHREAD_PROCESS_SHARED`
+- The synchronization uses DPDK primitives (`rte_spinlock_t`, `rte_rwlock_t`) which are designed for shared memory
+
+#### Preferred Alternatives
+
+For DPDK code, prefer DPDK's own synchronization primitives which are designed for shared memory:
+
+| pthread Primitive | DPDK Alternative |
+|-------------------|------------------|
+| `pthread_mutex_t` | `rte_spinlock_t` (busy-wait) or properly initialized pthread mutex |
+| `pthread_rwlock_t` | `rte_rwlock_t` |
+| `pthread_spinlock_t` | `rte_spinlock_t` |
+
+Note: `rte_spinlock_t` and `rte_rwlock_t` work correctly in shared memory without special initialization, but they are spinning locks unsuitable for long wait times.
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `PRId64`, `PRIu64`, `PRIx64` (used as `"%" PRIu64`) |
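+
+A standalone illustration: `PRIu64` from `<inttypes.h>` expands to the
+correct conversion specifier for `uint64_t` on every platform, whereas
+`%llu` assumes `long long` and is not portable:
+
+```c
+#include <assert.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+
+int main(void)
+{
+	uint64_t pkts = 123456789012345ULL;
+	char buf[64];
+
+	/* "%" PRIu64 is string-pasted into the right format for uint64_t */
+	snprintf(buf, sizeof(buf), "rx=%" PRIu64, pkts);
+	assert(strcmp(buf, "rx=123456789012345") == 0);
+	return 0;
+}
+```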
+
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
+---
+
+## Deprecated API Usage
+
+New patches must not introduce usage of deprecated APIs, macros, or functions.
+Deprecated items are marked with `RTE_DEPRECATED` or documented in the
+deprecation notices section of the release notes.
+
+### Rules for New Code
+
+- Do not call functions marked with `RTE_DEPRECATED` or `__rte_deprecated`
+- Do not use macros that have been superseded by newer alternatives
+- Do not use data structures or enum values marked as deprecated
+- Check `doc/guides/rel_notes/deprecation.rst` for planned deprecations
+- When a deprecated API has a replacement, use the replacement
+
+### Deprecating APIs
+
+A patch may mark an API as deprecated provided:
+
+- No remaining usages exist in the current DPDK codebase
+- The deprecation is documented in the release notes
+- A migration path or replacement API is documented
+- The `RTE_DEPRECATED` macro is used to generate compiler warnings
+
+```c
+/* Marking a function as deprecated */
+__rte_deprecated
+int
+rte_old_function(void);
+
+/* With a message pointing to the replacement */
+__rte_deprecated_msg("use rte_new_function() instead")
+int
+rte_old_function(void);
+```
+
+### Common Deprecated Patterns
+
+| Deprecated | Replacement | Notes |
+|-----------|-------------|-------|
+| `rte_atomic*_t` types | C11 atomics | Use `rte_atomic_xxx()` wrappers |
+| `rte_smp_*mb()` barriers | `rte_atomic_thread_fence()` | See Atomics section |
+| `pthread_*()` in portable code | `rte_thread_*()` | See Threading section |
+
+When reviewing patches that add new code, flag any usage of deprecated APIs
+as requiring change to use the modern replacement.
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - tag placed in a .c file instead of the header */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+**Note for AI reviewers**: You cannot verify compilation order or cross-patch dependencies from patch review alone. Do NOT flag patches claiming they "would fail to compile" based on symbols used in other patches in the series. Assume the patch author has ordered them correctly.
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be used in `/app` test directory
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include "test.h"
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, NOHUGE_OK, ASAN_OK, test_feature);
+```
+
+The `REGISTER_FAST_TEST` macro parameters are:
+- Test name (e.g., `feature_autotest`)
+- `NOHUGE_OK` or `HUGEPAGES_REQUIRED` - whether test can run without hugepages
+- `ASAN_OK` or `ASAN_FAILS` - whether test is compatible with Address Sanitizer
+- Test function name
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+- Release notes are NOT required for:
+  - Test-only changes (unit tests, functional tests)
+  - Internal APIs and helper functions (not exported to applications)
+  - Internal implementation changes that don't affect public API
+
+### RST Documentation Style
+
+When reviewing `.rst` documentation files, prefer **definition lists**
+over simple bullet lists where each item has a term and a description.
+Definition lists produce better-structured HTML/PDF output and are
+easier to scan.
+
+**When to suggest a definition list:**
+- A bullet list where each item starts with a bold or emphasized term
+  followed by a dash, colon, or long explanation
+- Lists of options, parameters, configuration values, or features
+  where each entry has a name and a description
+- Glossary-style enumerations
+
+**When a simple list is fine (do NOT flag):**
+- Short lists of items without descriptions (e.g., file names, steps)
+- Lists where items are single phrases or sentences with no term/definition structure
+- Enumerated steps in a procedure
+
+**RST definition list syntax:**
+
+```rst
+term 1
+   Description of term 1.
+
+term 2
+   Description of term 2.
+   Can span multiple lines.
+```
+
+**Example — flag this pattern:**
+
+```rst
+* **error** - Fail with error (default)
+* **truncate** - Truncate content to fit token limit
+* **summary** - Request high-level summary review
+```
+
+**Suggest rewriting as:**
+
+```rst
+error
+   Fail with error (default).
+
+truncate
+   Truncate content to fit token limit.
+
+summary
+   Request high-level summary review.
+```
+
+This is a **Warning**-level suggestion, not an Error. Do not flag it
+when the existing list structure is appropriate (see "when a simple
+list is fine" above).
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+- Internal APIs (used only within DPDK, not exported to applications) do NOT require release notes
+
+### ABI Compatibility and Symbol Exports
+
+**IMPORTANT**: DPDK uses automatic symbol map generation. Do **NOT** recommend
+manually editing `version.map` files - they are auto-generated from source code
+annotations.
+
+#### Symbol Export Macros
+
+New public functions must be annotated with export macros (defined in
+`rte_export.h`). Place the macro on the line immediately before the function
+definition in the `.c` file:
+
+```c
+/* For stable ABI symbols */
+RTE_EXPORT_SYMBOL(rte_foo_create)
+int
+rte_foo_create(struct rte_foo_config *config)
+{
+    /* ... */
+}
+
+/* For experimental symbols (include version when first added) */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_new_feature, 25.03)
+__rte_experimental
+int
+rte_foo_new_feature(void)
+{
+    /* ... */
+}
+
+/* For internal symbols (shared between DPDK components only) */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_foo_internal_helper)
+int
+rte_foo_internal_helper(void)
+{
+    /* ... */
+}
+```
+
+#### Symbol Export Rules
+
+- `RTE_EXPORT_SYMBOL` - Use for stable ABI functions
+- `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, ver)` - Use for new experimental APIs
+  (version is the DPDK release, e.g., `25.03`)
+- `RTE_EXPORT_INTERNAL_SYMBOL` - Use for functions shared between DPDK libs/drivers
+  but not part of public API
+- Export macros go in `.c` files, not headers
+- The build system generates linker version maps automatically
+
+#### What NOT to Review
+
+- Do **NOT** flag missing `version.map` updates - maps are auto-generated
+- Do **NOT** suggest adding symbols to `lib/*/version.map` files
+
+#### ABI Versioning for Changed Functions
+
+When changing the signature of an existing stable function, use versioning macros
+from `rte_function_versioning.h`:
+
+- `RTE_VERSION_SYMBOL` - Create versioned symbol for backward compatibility
+- `RTE_DEFAULT_SYMBOL` - Mark the new default version
+
+Follow ABI policy and versioning guidelines in the contributor documentation.
+Enable ABI checks with the `DPDK_ABI_REF_VERSION` environment variable.
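A hedged sketch of the versioning pattern, using a hypothetical `rte_foo_get()` (the macro argument order shown is illustrative; check `rte_function_versioning.h` for the current signatures):

```c
/* Keep the old implementation exported under the previous ABI version */
RTE_VERSION_SYMBOL(24, int, rte_foo_get, (int id))
{
    /* ... old behavior, unchanged for existing binaries ... */
}

/* Export the new implementation as the default symbol */
RTE_DEFAULT_SYMBOL(25, int, rte_foo_get, (int id, int flags))
{
    /* ... new behavior ... */
}
```

Existing binaries keep resolving the old version while newly linked applications pick up the default symbol.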
+
+---
+
+## LTS (Long Term Stable) Release Review
+
+LTS releases are DPDK versions ending in `.11` (e.g., 23.11, 22.11,
+21.11, 20.11, 19.11). When reviewing patches targeting an LTS branch,
+apply stricter criteria:
+
+### LTS-Specific Rules
+
+- **Only bug fixes allowed** -- no new features
+- **No new APIs** (experimental or stable)
+- **ABI must remain unchanged** -- no symbol additions, removals,
+  or signature changes
+- Backported fixes should reference the original commit with a
+  `Fixes:` tag
+- Copyright years should reflect when the code was originally
+  written
+- Be conservative: reject changes that are not clearly bug fixes
+
+### What to Flag on LTS Branches
+
+**Error:**
+- New feature code (new functions, new driver capabilities)
+- New experimental or stable API additions
+- ABI changes (new or removed symbols, changed function signatures)
+- Changes that add new configuration options or parameters
+
+**Warning:**
+- Large refactoring that goes beyond what is needed for a fix
+- Missing `Fixes:` tag on a backported bug fix
+- Missing `Cc: stable@dpdk.org`
+
+### When LTS Rules Apply
+
+LTS rules apply when the reviewer is told the target release is an
+LTS version (via the `--release` option or equivalent). If no
+release is specified, assume the patch targets the main development
+branch where new features and APIs are allowed.
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message and License
+
+Checked by `devtools/checkpatches.sh` -- not duplicated here.
+
+### Code Style
+
+- [ ] Lines <=100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+- [ ] No usage of deprecated APIs, macros, or functions
+- [ ] Process-shared primitives in shared memory use `PTHREAD_PROCESS_SHARED`
+- [ ] `mmap()` return checked against `MAP_FAILED`, not `NULL`
+- [ ] Statistics use `+=` not `=` for accumulation
+- [ ] Integer multiplies widened before operation when result is 64-bit
+- [ ] Descriptor chain traversals bounded by ring size or loop counter
+- [ ] 64-bit bitmasks use `1ULL <<` or `RTE_BIT64()`, not `1 <<`
+- [ ] Left shifts of `uint8_t`/`uint16_t` cast to unsigned target width before shift when result is 64-bit
+- [ ] No unconditional variable overwrites before read
+- [ ] Nested loops use distinct counter variables
+- [ ] No `memcpy`/`memcmp` with identical source and destination pointers
+- [ ] `rte_mbuf_raw_free_bulk()` not used on mixed-pool mbuf arrays (Tx paths, ring dequeue, error paths)
+- [ ] MTU not confused with frame length (MTU = max L3 packet size, frame = MTU + L2 overhead)
+- [ ] PMDs read `dev->data->mtu` after configure, not `dev_conf.rxmode.mtu`
+- [ ] Ethernet overhead not hardcoded -- derived from device capabilities
+- [ ] Scatter Rx enabled or error returned when frame length exceeds single mbuf data size
+- [ ] `mtu_set` allows large MTU when scatter Rx is active; re-selects Rx burst function
+- [ ] Rx queue setup selects scattered Rx function when frame length exceeds mbuf
+- [ ] Static function pointer arrays declared `const` when contents are compile-time fixed
+- [ ] `bool` used for pure true/false variables, parameters, and predicate return types
+- [ ] Shared variables use `rte_atomic_*_explicit()`, not `volatile` or bare access
+- [ ] No `__atomic_*()` GCC built-ins or `__ATOMIC_*` ordering constants (use `rte_atomic_*_explicit()` and `rte_memory_order_*`)
+- [ ] No `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` (use `rte_atomic_thread_fence()`)
+- [ ] Memory ordering is the weakest correct choice (`relaxed` for counters, `acquire`/`release` for publish/consume)
+- [ ] Sensitive data cleared with `explicit_bzero()`/`rte_free_sensitive()`, not `memset()`
+
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+- [ ] New public functions have `RTE_EXPORT_*` macro in `.c` file
+- [ ] Experimental functions use `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, version)`
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] RST docs use definition lists for term/description patterns
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (<=3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+- No strict line-length limit for meson files; do not flag lines under 100 characters
+
+```meson
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list
+headers = files(
+        'header1.h',
+        'header2.h',
+        'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must comply with standard Python formatting conventions (PEP 8)
+- Use **`black`** for code formatting validation
+- Line length acceptable up to 100 characters
+
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+
+*Correctness bugs (highest value findings):*
+- Use-after-free
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference on reachable code path
+- Buffer overflow or out-of-bounds access
+- Missing error check on a function that can fail, leading to undefined behavior
+- Race condition on shared mutable state without synchronization
+- `volatile` used instead of atomics for inter-thread shared variables
+- `__atomic_*()` GCC built-ins in new code (must use `rte_atomic_*_explicit()`)
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` in new code (must use `rte_atomic_thread_fence()`)
+- Error path that skips necessary cleanup
+- `mmap()` return value checked against NULL instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=` (overwrite vs increment)
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied indices
+- `1 << n` used for 64-bit bitmask (undefined behavior if n >= 32)
+- Left shift of `uint8_t`/`uint16_t` used in 64-bit context without widening cast (sign extension)
+- Variable assigned then unconditionally overwritten before read
+- Same variable used as counter in nested loops
+- `memcpy`/`memcmp` with same pointer as both arguments (UB or no-op logic error)
+- `rte_mbuf_raw_free_bulk()` on mbuf array where mbufs may come from different pools (Tx burst, ring dequeue)
+- MTU used where frame length is needed or vice versa (off by L2 overhead)
+- `dev_conf.rxmode.mtu` read after configure instead of `dev->data->mtu` (stale value)
+- MTU accepted without scatter Rx when frame size exceeds single mbuf capacity (silent truncation/drop)
+- `mtu_set` rejects valid MTU when scatter Rx is already enabled
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size comparison
+
+*Process and format errors:*
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+- pthread mutex/cond/rwlock in shared memory without `PTHREAD_PROCESS_SHARED`
+
+*API design errors (new libraries only):*
+- Ops/callback struct with 20+ function pointers in an installed header
+- Callback struct members with no Doxygen documentation
+- Void-returning callbacks for failable operations (errors silently swallowed)
+
+**Warning** (should fix):
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- New public function missing `RTE_EXPORT_*` macro
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+- Usage of deprecated APIs, macros, or functions in new code
+- RST documentation using bullet lists where definition lists would be more appropriate
+- Ops/callback struct with >5 function pointers in an installed header (ABI risk)
+- New API using fixed enum+union where TLV pattern would be more extensible
+- Installed header labeled "private" or "internal" in meson.build
+- New library using global singleton instead of handle-based API
+- Static function pointer array not declared `const` when contents are compile-time constant
+- `int` used instead of `bool` for variables or return values that are purely true/false
+- `rte_memory_order_seq_cst` used where weaker ordering (`relaxed`, `acquire`/`release`) suffices
+- Standalone `rte_atomic_thread_fence()` where ordering on the atomic operation itself would be clearer
+- `getenv()` used in a driver or library for runtime configuration instead of devargs
+- Hardcoded Ethernet overhead constant instead of per-device overhead calculation
+- PMD does not advertise `RTE_ETH_RX_OFFLOAD_SCATTER` in `rx_offload_capa` but hardware supports multi-segment Rx
+- PMD `dev_info` reports `max_rx_pktlen` or `max_mtu` inconsistent with each other or with the Ethernet overhead
+- `mtu_set` callback does not re-select the Rx burst function after changing MTU
+
+**Do NOT flag** (common false positives):
+- Missing `version.map` updates (maps are auto-generated from `RTE_EXPORT_*` macros)
+- Suggesting manual edits to any `version.map` file
+- SPDX/copyright format, copyright years, copyright holders (not subject to AI review)
+- Commit message formatting (subject length, punctuation, tag order, case-sensitive terms) -- checked by checkpatches.sh
+- Meson file lines under 100 characters
+- Comparisons using `== 0`, `!= 0`, `== NULL`, `!= NULL` as "implicit" (these ARE explicit)
+- Comparisons wrapped in `likely()` or `unlikely()` macros - these are still explicit if using == or !=
+- Anything you determine is correct (do not mention non-issues or say "No issue here")
+- `REGISTER_FAST_TEST` using `NOHUGE_OK`/`ASAN_OK` macros (this is the correct current format)
+- Missing release notes for test-only changes (unit tests do not require release notes)
+- Missing release notes for internal APIs or helper functions (only public APIs need release notes)
+- Any item you later correct with "(Correction: ...)" or "actually acceptable" - just omit it
+- Vague concerns ("should be verified", "should be checked") - if you're not sure it's wrong, don't flag it
+- Items where you say "which is correct" or "this is correct" - if it's correct, don't mention it at all
+- Items where you conclude "no issue here" or "this is actually correct" - omit these entirely
+- Clean patches in a series - do not include a patch just to say "no issues" or describe what it does
+- Cross-patch compilation dependencies - you cannot determine patch ordering correctness from review
+- Claims that a symbol "was removed in patch N" causing issues in patch M - assume author ordered correctly
+- Any speculation about whether patches will compile when applied in sequence
+- Mutexes/locks in process-private memory (standard `malloc`, stack, static non-shared) - these don't need `PTHREAD_PROCESS_SHARED`
+- Use of `rte_spinlock_t` or `rte_rwlock_t` in shared memory (these work correctly without special init)
+- `volatile` used for MMIO/hardware register access in drivers (this is correct usage)
+- Left shift of `uint8_t`/`uint16_t` where the result is stored in a `uint32_t` or narrower variable and not used in pointer arithmetic or 64-bit context (sign extension cannot occur)
+- `getenv()` used in EAL, examples, app/test, or build/config scripts (only flag in drivers/ and lib/)
+- Reading `rxmode.mtu` inside `rte_eth_dev_configure()` implementation (that is where the user request is consumed)
+- `=` assignment to MTU or frame length fields during initial setup (only flag stale reads of `rxmode.mtu` outside configure)
+- PMDs that auto-enable scatter when MTU exceeds mbuf size (this is the correct pattern)
+- Hardcoded `RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN` as overhead when the PMD does not support VLAN and device info is consistent
+- Tagged frames exceeding 1518 bytes at standard MTU -- a single-tagged frame of 1522 bytes is valid at MTU 1500 (the outer VLAN header is L2 overhead, not payload). Note: inner VLAN tags in QinQ *do* consume MTU; see the MTU section for details.
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+# Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example:
+`rte_foo_lookup()` dereferences `name` without a NULL check, which crashes
+when a caller passes a missing argument. Return `-EINVAL` when
+`name == NULL` before performing the lookup.
+
+---
+
+## FINAL CHECK BEFORE SUBMITTING REVIEW
+
+Before outputting your review, do two separate passes:
+
+### Pass 1: Verify correctness bugs are included
+
+Ask: "Did I trace every error path for resource leaks? Did I check
+for use-after-free? Did I verify error codes are propagated?"
+
+If you identified a potential correctness bug but talked yourself
+out of it, **add it back**. It is better to report a possible bug
+than to miss a real one.
+
+### Pass 2: Remove style/process false positives
+
+For EACH style/process item, ask: "Did I conclude this is actually
+fine/correct/acceptable/no issue?"
+
+If YES, DELETE THAT ITEM. It should not be in your output.
+
+An item that says "X is wrong... actually this is correct" is a
+FALSE POSITIVE and must be removed. This applies to style, format,
+and process items only.
+
+**If your Errors section would be empty after this check, that's
+fine -- it means the patches are good.**
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v11 2/6] devtools: add multi-provider AI patch review script
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
@ 2026-03-27 15:41     ` Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This is an AI-generated script that reviews DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 1348 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1348 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..4a2950d6a4
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,1348 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from datetime import date
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, Iterator
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Large file handling modes
+LARGE_FILE_MODES = ["error", "truncate", "chunk", "commits-only", "summary"]
+
+# Approximate tokens per character (conservative estimate for code)
+CHARS_PER_TOKEN = 3.5
+
+# Default token limits by provider (leaving room for system prompt and response)
+PROVIDER_INPUT_LIMITS = {
+    "anthropic": 180000,  # 200K context, reserve for system/response
+    "openai": 900000,  # GPT-4.1 has 1M context
+    "xai": 1800000,  # Grok 4.1 Fast has 2M context
+    "google": 900000,  # Gemini 3 Flash has 1M context
+}
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# LTS releases: any DPDK release with minor version .11
+# (e.g., 19.11, 20.11, 21.11, 22.11, 23.11, 24.11, 25.11, ...)
+
+SYSTEM_PROMPT_BASE = """\
+You are an expert DPDK code reviewer. Analyze patches for compliance with \
+DPDK coding standards and contribution guidelines. Provide clear, actionable \
+feedback organized by severity (Error, Warning, Info) as defined in the \
+guidelines."""
+
+LTS_RULES = """
+LTS (Long Term Stable) branch rules apply:
+- Only bug fixes allowed, no new features
+- No new APIs (experimental or stable)
+- ABI must remain unchanged
+- Backported fixes should reference the original commit with Fixes: tag
+- Copyright years should reflect when the code was originally written
+- Be conservative: reject changes that aren't clearly bug fixes"""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Focus on:
+
+1. Correctness bugs (resource leaks, use-after-free, race conditions, etc.)
+2. C coding style (forbidden tokens, implicit comparisons, unnecessary patterns)
+3. API and documentation requirements
+4. Any other guideline violations
+
+Note: commit message formatting and SPDX/copyright compliance are checked \
+by checkpatches.sh and should NOT be flagged here.
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def is_lts_release(release: str | None) -> bool:
+    """Check if a release is an LTS release.
+
+    Per DPDK project guidelines, any release with minor version .11
+    is an LTS release (e.g., 19.11, 21.11, 23.11, 24.11, 25.11).
+    """
+    if not release:
+        return False
+    # Check for explicit -lts suffix
+    if "-lts" in release.lower():
+        return True
+    # Extract base version (e.g., "23.11" from "23.11.1" or "23.11-rc1")
+    version = release.split("-")[0]
+    parts = version.split(".")
+    if len(parts) >= 2:
+        try:
+            minor = int(parts[1])
+            return minor == 11
+        except ValueError:
+            pass
+    return False
+
+
+def estimate_tokens(text: str) -> int:
+    """Estimate token count from text length."""
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def split_mbox_patches(content: str) -> list[str]:
+    """Split an mbox file into individual patches."""
+    patches = []
+    current_patch = []
+    in_patch = False
+
+    for line in content.split("\n"):
+        # Detect start of new message in mbox format
+        if line.startswith("From ") and (
+            " Mon " in line
+            or " Tue " in line
+            or " Wed " in line
+            or " Thu " in line
+            or " Fri " in line
+            or " Sat " in line
+            or " Sun " in line
+        ):
+            if current_patch:
+                patches.append("\n".join(current_patch))
+            current_patch = [line]
+            in_patch = True
+        elif in_patch:
+            current_patch.append(line)
+
+    # Don't forget the last patch
+    if current_patch:
+        patches.append("\n".join(current_patch))
+
+    return patches if patches else [content]
+
+
+def extract_commit_messages(content: str) -> str:
+    """Extract only commit messages from patch content."""
+    patches = split_mbox_patches(content)
+    messages = []
+
+    for patch in patches:
+        lines = patch.split("\n")
+        msg_lines = []
+        in_headers = True
+        in_body = False
+        found_subject = False
+
+        for line in lines:
+            # Collect headers we care about
+            if in_headers:
+                if line.startswith("Subject:"):
+                    msg_lines.append(line)
+                    found_subject = True
+                elif line.startswith(("From:", "Date:")):
+                    msg_lines.append(line)
+                elif line.startswith((" ", "\t")) and found_subject:
+                    # Subject continuation
+                    msg_lines.append(line)
+                elif line == "":
+                    if found_subject:
+                        in_headers = False
+                        in_body = True
+                        msg_lines.append("")
+            elif in_body:
+                # Stop at the diff
+                if line.startswith("---") and not line.startswith("----"):
+                    break
+                if line.startswith("diff --git"):
+                    break
+                msg_lines.append(line)
+
+        if msg_lines:
+            messages.append("\n".join(msg_lines))
+
+    return "\n\n---\n\n".join(messages)
+
+
+def truncate_content(
+    content: str, max_tokens: float, provider: str
+) -> tuple[str, bool]:
+    """Truncate content to fit within token limit."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN)
+
+    if len(content) <= max_chars:
+        return content, False
+
+    # Try to truncate at a reasonable boundary
+    truncated = content[:max_chars]
+
+    # Find last complete diff hunk or patch boundary
+    last_diff = truncated.rfind("\ndiff --git")
+    last_patch = truncated.rfind("\nFrom ")
+
+    if last_diff > max_chars * 0.5:
+        truncated = truncated[:last_diff]
+    elif last_patch > max_chars * 0.5:
+        truncated = truncated[:last_patch]
+
+    truncated += "\n\n[... Content truncated due to size limits ...]\n"
+    return truncated, True
+
+
+def chunk_content(
+    content: str, max_tokens: int, provider: str
+) -> Iterator[tuple[str, int, int]]:
+    """Split content into chunks that fit within token limit.
+
+    Yields tuples of (chunk_content, chunk_number, total_chunks).
+    """
+    patches = split_mbox_patches(content)
+
+    if len(patches) == 1:
+        # Single large patch - split by diff sections
+        yield from chunk_single_patch(content, max_tokens)
+        return
+
+    # Multiple patches - group them to fit within limits
+    chunks = []
+    current_chunk = []
+    current_size = 0
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)  # 90% to leave margin
+
+    for patch in patches:
+        patch_size = len(patch)
+        if current_size + patch_size > max_chars and current_chunk:
+            chunks.append("\n".join(current_chunk))
+            current_chunk = []
+            current_size = 0
+
+        if patch_size > max_chars:
+            # Single patch too large, truncate it
+            if current_chunk:
+                chunks.append("\n".join(current_chunk))
+                current_chunk = []
+                current_size = 0
+            truncated, _ = truncate_content(patch, max_tokens * 0.9, provider)
+            chunks.append(truncated)
+        else:
+            current_chunk.append(patch)
+            current_size += patch_size
+
+    if current_chunk:
+        chunks.append("\n".join(current_chunk))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def chunk_single_patch(content: str, max_tokens: int) -> Iterator[tuple[str, int, int]]:
+    """Split a single large patch by diff sections."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)
+
+    # Extract header (everything before first diff)
+    first_diff = content.find("\ndiff --git")
+    if first_diff == -1:
+        # No diff sections, just truncate
+        truncated, _ = truncate_content(content, max_tokens * 0.9, "anthropic")
+        yield truncated, 1, 1
+        return
+
+    header = content[: first_diff + 1]
+    diff_content = content[first_diff + 1 :]
+
+    # Split by diff sections
+    diffs = []
+    current_diff = []
+    for line in diff_content.split("\n"):
+        if line.startswith("diff --git") and current_diff:
+            diffs.append("\n".join(current_diff))
+            current_diff = []
+        current_diff.append(line)
+    if current_diff:
+        diffs.append("\n".join(current_diff))
+
+    # Group diffs into chunks
+    chunks = []
+    current_chunk_diffs = []
+    current_size = len(header)
+
+    for diff in diffs:
+        diff_size = len(diff)
+        if current_size + diff_size > max_chars and current_chunk_diffs:
+            chunks.append(header + "\n".join(current_chunk_diffs))
+            current_chunk_diffs = []
+            current_size = len(header)
+
+        if diff_size + len(header) > max_chars:
+            # Single diff too large
+            if current_chunk_diffs:
+                chunks.append(header + "\n".join(current_chunk_diffs))
+                current_chunk_diffs = []
+            truncated_diff = diff[: max_chars - len(header) - 100]
+            truncated_diff += "\n[... diff truncated ...]\n"
+            chunks.append(header + truncated_diff)
+            current_size = len(header)
+        else:
+            current_chunk_diffs.append(diff)
+            current_size += diff_size
+
+    if current_chunk_diffs:
+        chunks.append(header + "\n".join(current_chunk_diffs))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def get_summary_prompt() -> str:
+    """Get prompt modifications for summary mode."""
+    return """
+NOTE: This is a LARGE patch series. Provide a HIGH-LEVEL summary review only:
+- Focus on overall architecture and design concerns
+- Check commit message formatting across the series
+- Identify any obvious policy violations
+- Do NOT attempt detailed line-by-line code review
+- Summarize the scope and purpose of the changes
+"""
+
+
+def format_combined_reviews(
+    reviews: list[tuple[str, str]], output_format: str, patch_name: str
+) -> str:
+    """Combine multiple chunk/patch reviews into a single output."""
+    if output_format == "json":
+        combined = {
+            "patch_file": patch_name,
+            "sections": [
+                {"label": label, "review": review} for label, review in reviews
+            ],
+        }
+        return json.dumps(combined, indent=2)
+    elif output_format == "html":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"<h2>{label}</h2>\n{review}")
+        return "\n<hr>\n".join(sections)
+    elif output_format == "markdown":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"## {label}\n\n{review}")
+        return "\n\n---\n\n".join(sections)
+    else:  # text
+        sections = []
+        for label, review in reviews:
+            sections.append(f"=== {label} ===\n\n{review}")
+        return ("\n\n" + "=" * 60 + "\n\n").join(sections)
+
+
+def build_system_prompt(review_date: str, release: str | None) -> str:
+    """Build system prompt with date and release context."""
+    prompt = SYSTEM_PROMPT_BASE
+    prompt += f"\n\nCurrent date: {review_date}."
+
+    if release:
+        prompt += f"\nTarget DPDK release: {release}."
+        if is_lts_release(release):
+            prompt += LTS_RULES
+        else:
+            prompt += "\nThis is a main branch or standard release."
+            prompt += "\nNew features and experimental APIs are allowed."
+
+    return prompt
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": system_prompt},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": system_prompt},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": system_prompt}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+    verbose: bool = False,
+) -> str:
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: {usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: {usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def get_last_message_id(patch_content: str) -> str | None:
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-ID:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize: remove < > and add them back
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content: str) -> str | None:
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][8:].strip()
+            i += 1
+            # Unfold continuation lines (folding whitespace becomes a space)
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+) -> bool:
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try sendmail
+        try:
+            subprocess.run(
+                ["sendmail", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via sendmail", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        # Try msmtp
+        try:
+            subprocess.run(
+                ["msmtp", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via msmtp", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s -r 24.11 patch.patch           # Review for specific release
+    %(prog)s -r 24.11-lts patch.patch       # Review for LTS branch
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+
+Large File Handling:
+    %(prog)s --split-patches series.mbox    # Review each patch separately
+    %(prog)s --split-patches --patch-range 1-5 series.mbox  # Review patches 1-5
+    %(prog)s --large-file=truncate patch.mbox   # Truncate to fit limit
+    %(prog)s --large-file=commits-only series.mbox  # Review commit messages only
+    %(prog)s --large-file=summary series.mbox   # High-level summary only
+    %(prog)s --large-file=chunk series.mbox     # Split and review in chunks
+
+Large File Modes:
+    error       - Fail with error (default)
+    truncate    - Truncate content to fit token limit
+    chunk       - Split into chunks and review each
+    commits-only - Extract and review only commit messages
+    summary     - Request high-level summary review
+
+LTS Releases:
+    Use -r/--release with LTS version (e.g., 24.11-lts, 23.11) to enable
+    stricter review rules: bug fixes only, no new features or APIs.
+    Any DPDK release with minor version .11 is an LTS release.
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Date and release options
+    parser.add_argument(
+        "-D",
+        "--date",
+        metavar="YYYY-MM-DD",
+        help="Review date context (default: today)",
+    )
+    parser.add_argument(
+        "-r",
+        "--release",
+        metavar="VERSION",
+        help="Target DPDK release (e.g., 24.11, 23.11-lts)",
+    )
+
+    # Large file handling options
+    large_group = parser.add_argument_group("Large File Handling")
+    large_group.add_argument(
+        "--large-file",
+        choices=LARGE_FILE_MODES,
+        default="error",
+        metavar="MODE",
+        help="How to handle large files: error (default), truncate, "
+        "chunk, commits-only, summary",
+    )
+    large_group.add_argument(
+        "--max-tokens",
+        type=int,
+        metavar="N",
+        help="Max input tokens (default: provider-specific)",
+    )
+    large_group.add_argument(
+        "--split-patches",
+        action="store_true",
+        help="Split mbox into individual patches and review each separately",
+    )
+    large_group.add_argument(
+        "--patch-range",
+        metavar="N-M",
+        help="Review only patches N through M (1-indexed, use with --split-patches)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine review date
+    review_date = args.date or date.today().isoformat()
+
+    # Build system prompt with date and release context
+    system_prompt = build_system_prompt(review_date, args.release)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    # Determine max tokens for this provider
+    max_input_tokens = args.max_tokens or PROVIDER_INPUT_LIMITS.get(
+        args.provider, 100000
+    )
+
+    # Estimate token count
+    estimated_tokens = estimate_tokens(patch_content + agents_content)
+
+    # Parse patch range if specified
+    patch_start, patch_end = None, None
+    if args.patch_range:
+        try:
+            if "-" in args.patch_range:
+                start, end = args.patch_range.split("-", 1)
+                patch_start = int(start)
+                patch_end = int(end)
+            else:
+                patch_start = patch_end = int(args.patch_range)
+        except ValueError:
+            error(f"Invalid --patch-range format: {args.patch_range}")
+
+    # Handle --split-patches mode
+    if args.split_patches:
+        patches = split_mbox_patches(patch_content)
+        total_patches = len(patches)
+
+        if total_patches == 1:
+            print(
+                "Note: Only 1 patch found in mbox, --split-patches has no effect",
+                file=sys.stderr,
+            )
+        else:
+            print(
+                f"Found {total_patches} patches in mbox",
+                file=sys.stderr,
+            )
+
+            # Apply patch range filter
+            if patch_start is not None:
+                if patch_start < 1 or patch_start > total_patches:
+                    error(
+                        f"Patch range start {patch_start} out of range (1-{total_patches})"
+                    )
+                if patch_end < patch_start or patch_end > total_patches:
+                    error(
+                        f"Patch range end {patch_end} out of range ({patch_start}-{total_patches})"
+                    )
+                patches = patches[patch_start - 1 : patch_end]
+                print(
+                    f"Reviewing patches {patch_start}-{patch_end} ({len(patches)} patches)",
+                    file=sys.stderr,
+                )
+
+            # Review each patch separately
+            all_reviews = []
+            for i, patch in enumerate(patches, patch_start or 1):
+                patch_label = f"Patch {i}/{total_patches}"
+                print(f"\nReviewing {patch_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    patch,
+                    f"{patch_name} ({patch_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((patch_label, review_text))
+
+            # Combine reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal API call
+            estimated_tokens = 0  # Bypass size check since we've already processed
+
+    # Check if content is too large
+    is_large = estimated_tokens > max_input_tokens
+
+    if is_large:
+        print(
+            f"Warning: Estimated {estimated_tokens:,} tokens exceeds limit of "
+            f"{max_input_tokens:,}",
+            file=sys.stderr,
+        )
+
+        if args.large_file == "error":
+            error(
+                f"Patch file too large ({estimated_tokens:,} tokens). "
+                f"Use --large-file=truncate|chunk|commits-only|summary to handle, "
+                f"or --split-patches to review patches individually."
+            )
+        elif args.large_file == "truncate":
+            print("Truncating content to fit token limit...", file=sys.stderr)
+            patch_content, was_truncated = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+            if was_truncated:
+                print("Content was truncated.", file=sys.stderr)
+        elif args.large_file == "commits-only":
+            print("Extracting commit messages only...", file=sys.stderr)
+            patch_content = extract_commit_messages(patch_content)
+            new_estimate = estimate_tokens(patch_content + agents_content)
+            print(
+                f"Reduced to ~{new_estimate:,} tokens (commit messages only)",
+                file=sys.stderr,
+            )
+            if new_estimate > max_input_tokens:
+                patch_content, _ = truncate_content(
+                    patch_content, max_input_tokens, args.provider
+                )
+        elif args.large_file == "summary":
+            print("Using summary mode for large patch...", file=sys.stderr)
+            system_prompt += get_summary_prompt()
+            patch_content, _ = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+        elif args.large_file == "chunk":
+            print("Processing in chunks...", file=sys.stderr)
+            all_reviews = []
+            for chunk, chunk_num, total_chunks in chunk_content(
+                patch_content, max_input_tokens, args.provider
+            ):
+                chunk_label = f"Chunk {chunk_num}/{total_chunks}"
+                print(f"Reviewing {chunk_label}...", file=sys.stderr)
+
+                review_text = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    chunk,
+                    f"{patch_name} ({chunk_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                all_reviews.append((chunk_label, review_text))
+
+            # Combine chunk reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal single API call below
+            estimated_tokens = 0
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Review date: {review_date}", file=sys.stderr)
+        if args.release:
+            lts_status = " (LTS)" if is_lts_release(args.release) else ""
+            print(f"Target release: {args.release}{lts_status}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        print(f"Estimated tokens: {estimated_tokens:,}", file=sys.stderr)
+        print(f"Max input tokens: {max_input_tokens:,}", file=sys.stderr)
+        if args.large_file != "error":
+            print(f"Large file mode: {args.large_file}", file=sys.stderr)
+        if args.split_patches:
+            print("Split patches: yes", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API (unless already processed via chunks/split)
+    if estimated_tokens > 0:  # Not already processed
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            args.output_format,
+            args.verbose,
+        )
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "review_date": review_date,
+                "target_release": args.release,
+                "is_lts": is_lts_release(args.release) if args.release else False,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"<br>Target release: {args.release}{lts_badge}"
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) on {review_date} -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model}) on {review_date}{release_info}</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"\n*Target release: {args.release}{lts_badge}*\n"
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model}) on {review_date}*
+{release_info}
+{review_text}
+"""
+    else:  # text
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n"
+        output_text += f"Review date: {review_date}\n"
+        output_text += release_info
+        output_text += "\n" + review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model}) on {review_date}
+{release_info}
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v11 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-03-27 15:41     ` Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.
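The detection logic is simple enough to sketch. This Python version mirrors
the shell function get_available_providers() below (the script itself is
pure bash; the Python names here are illustrative only):

```python
import os

# Mirrors get_available_providers() from compare-reviews.sh: a provider
# counts as available when its API-key environment variable is non-empty.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}

def available_providers(env=None):
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

# With only OPENAI_API_KEY set (and XAI_API_KEY empty), only openai runs.
print(available_providers({"OPENAI_API_KEY": "sk-test", "XAI_API_KEY": ""}))
# ['openai']
```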

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.<ext> when using the
output directory option, where the extension matches the selected
format (txt, md, html, or json).
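The naming can be sketched as follows (a Python illustration of the
shell's `${PATCH_BASENAME%.*}` stem handling; the helper name is made up):

```python
from pathlib import Path

# Illustrative helper: the script builds
# ${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT} the same way.
def review_filename(output_dir, patch_file, provider, ext="txt"):
    stem = Path(patch_file).stem  # strips the last extension, like ${name%.*}
    return f"{output_dir}/{stem}-{provider}.{ext}"

print(review_filename("reviews", "my-patch.patch", "anthropic"))
# reviews/my-patch-anthropic.txt
```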

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e -o pipefail
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: \
+ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.53.0



* [PATCH v11 4/6] devtools: add multi-provider AI documentation review script
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (2 preceding siblings ...)
  2026-03-27 15:41     ` [PATCH v11 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-03-27 15:41     ` Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models. Supports batch processing of multiple files.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

Output formats (-f/--format):
  - text: plain text with extractable diff/msg markers (default)
  - markdown: formatted review document
  - html: complete HTML document with styling
  - json: structured data with metadata

For each input file, the script produces:
  - <basename>.{txt,md,html,json}: review in selected format
  - <basename>.diff: unified diff (text/json, or with -d flag)
  - <basename>.msg: commit message (text/json, or with -d flag)
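The markers in the text format are intended for mechanical extraction;
roughly like this (mirrors the parse_review_text() helper in the script):

```python
import re

# Pull the unified diff out from between the extraction markers,
# using the same pattern as parse_review_text() in review-doc.py.
def extract_diff(review_text):
    m = re.search(
        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
        review_text,
        re.DOTALL,
    )
    return m.group(1).strip() if m else ""

sample = (
    "Review notes...\n"
    "---UNIFIED_DIFF_START---\n"
    "--- a/doc/example.rst\n"
    "+++ b/doc/example.rst\n"
    "---UNIFIED_DIFF_END---\n"
)
print(extract_diff(sample))
```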

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide).
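The lookup is a first-match scan over a table ordered from most to least
specific; a trimmed sketch (the full COMMIT_PREFIX_MAP in the script
covers every guide directory):

```python
# Trimmed version of the path-to-prefix table in review-doc.py;
# more specific paths must come before the generic "doc/" fallbacks.
COMMIT_PREFIX_MAP = [
    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
    ("doc/guides/nics/", "doc/guides/nics:"),
    ("doc/guides/", "doc:"),
    ("doc/", "doc:"),
]

def get_commit_prefix(filepath):
    for prefix_path, prefix in COMMIT_PREFIX_MAP:
        if filepath.startswith(prefix_path):
            return prefix
    return "doc:"

print(get_commit_prefix("doc/guides/nics/ixgbe.rst"))
# doc/guides/nics:
```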

Features:
  - Multiple file processing with glob support
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output-dir option
  - Output format selection via -f/--format option
  - Force diff/msg generation via -d/--diff option
  - Quiet mode (-q) suppresses stdout output
  - Verbose mode (-v) shows token usage and API details
  - Email integration using git sendemail configuration
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py doc/guides/nics/*.rst
  ./devtools/review-doc.py -f html -d -o /tmp doc/guides/nics/*.rst
  ./devtools/review-doc.py --send-email --to dev@dpdk.org file.rst

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 1099 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1099 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..c8a1988a10
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,1099 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Accepts multiple documentation files and generates output for each.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Map output format to file extension
+FORMAT_EXTENSIONS = {
+    "text": ".txt",
+    "markdown": ".md",
+    "html": ".html",
+    "json": ".json",
+}
+
+# Additional markers for extracting diff/msg (used with --diff flag)
+DIFF_MARKERS_INSTRUCTION = """
+
+ADDITIONALLY, at the end of your response, include these exact markers for automated extraction:
+---COMMIT_MESSAGE_START---
+(same commit message as above)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(same unified diff as above)
+---UNIFIED_DIFF_END---
+"""
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds via git commit --sign)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config() -> dict[str, Any]:
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    if config["port"]:
+        config["port"] = int(config["port"])
+
+    return config
+
+
+def get_commit_prefix(filepath: str) -> str:
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+    verbose: bool = False,
+) -> str:
+    """Make API request to the specified provider."""
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Show verbose info
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        if provider == "anthropic":
+            usage = result.get("usage", {})
+            print(f"Input tokens: {usage.get('input_tokens', 'N/A')}", file=sys.stderr)
+            print(
+                f"Cache creation: {usage.get('cache_creation_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Cache read: {usage.get('cache_read_input_tokens', 0)}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('output_tokens', 'N/A')}", file=sys.stderr
+            )
+        elif provider == "google":
+            usage = result.get("usageMetadata", {})
+            print(
+                f"Prompt tokens: {usage.get('promptTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+            print(
+                f"Output tokens: {usage.get('candidatesTokenCount', 'N/A')}",
+                file=sys.stderr,
+            )
+        else:  # openai, xai
+            usage = result.get("usage", {})
+            print(
+                f"Prompt tokens: {usage.get('prompt_tokens', 'N/A')}", file=sys.stderr
+            )
+            print(
+                f"Completion tokens: {usage.get('completion_tokens', 'N/A')}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        return "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        return "".join(part.get("text", "") for part in parts)
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        return choices[0].get("message", {}).get("content", "")
+
+
+def parse_review_text(review_text: str) -> tuple[str, str]:
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def strip_diff_markers(text: str) -> str:
+    """Remove the diff/msg extraction markers from text."""
+    # Remove commit message markers and content
+    text = re.sub(
+        r"\n*---COMMIT_MESSAGE_START---\s*\n.*?\n---COMMIT_MESSAGE_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    # Remove unified diff markers and content
+    text = re.sub(
+        r"\n*---UNIFIED_DIFF_START---\s*\n.*?\n---UNIFIED_DIFF_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    return text.strip()
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+    verbose: bool = False,
+) -> bool:
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+        return True
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers. "
+        "Accepts multiple files and generates output for each.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s doc/guides/nics/*.rst              # Review all NIC docs
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst doc/guides/nics/i40e.rst
+    %(prog)s -f html -d -o /tmp/reviews doc/guides/nics/*.rst  # HTML + diff files
+    %(prog)s -f json -o /tmp doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+Output files (in output-dir):
+    <basename>.txt|.md|.html|.json  Review in selected format
+    <basename>.diff                 Unified diff (text/json, or with --diff)
+    <basename>.msg                  Commit message (text/json, or with --diff)
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+        """,
+    )
+
+    parser.add_argument(
+        "doc_files",
+        nargs="+",
+        metavar="doc_file",
+        help="Documentation file(s) to review",
+    )
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for all output files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress review output to stdout (only write files)",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-d",
+        "--diff",
+        action="store_true",
+        help="Always produce .diff and .msg files (automatic for text/json)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    # Validate all doc files exist before processing
+    doc_paths = []
+    for doc_file in args.doc_files:
+        doc_path = Path(doc_file)
+        if not doc_path.exists():
+            error(f"Documentation file not found: {doc_file}")
+        doc_paths.append((doc_file, doc_path))
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read AGENTS.md once
+    agents_content = agents_path.read_text()
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    provider_name = config["name"]
+
+    # Process each file
+    num_files = len(doc_paths)
+    for file_idx, (doc_file, doc_path) in enumerate(doc_paths, 1):
+        if num_files > 1:
+            print(
+                f"\n{'=' * 60}",
+                file=sys.stderr,
+            )
+            print(
+                f"Processing file {file_idx}/{num_files}: {doc_file}",
+                file=sys.stderr,
+            )
+            print(
+                f"{'=' * 60}",
+                file=sys.stderr,
+            )
+
+        # Determine output filenames
+        doc_basename = doc_path.stem
+        diff_file = output_dir / f"{doc_basename}.diff"
+        msg_file = output_dir / f"{doc_basename}.msg"
+
+        # Get commit prefix
+        commit_prefix = get_commit_prefix(doc_file)
+
+        # Read doc content
+        doc_content = doc_path.read_text()
+
+        if args.verbose:
+            print("=== Request ===", file=sys.stderr)
+            print(f"Provider: {args.provider}", file=sys.stderr)
+            print(f"Model: {model}", file=sys.stderr)
+            print(f"Output format: {args.output_format}", file=sys.stderr)
+            print(f"AGENTS file: {args.agents}", file=sys.stderr)
+            print(f"Doc file: {doc_file}", file=sys.stderr)
+            print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+            print(f"Output dir: {args.output_dir}", file=sys.stderr)
+            if args.send_email:
+                print("Send email: yes", file=sys.stderr)
+                print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+                if args.cc_addrs:
+                    print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+                print(f"From: {from_addr}", file=sys.stderr)
+            print("===============", file=sys.stderr)
+
+        # Call API
+        review_text = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            args.output_format,
+            args.diff,
+            args.verbose,
+        )
+
+        if not review_text:
+            print(
+                f"Warning: No response received for {doc_file}",
+                file=sys.stderr,
+            )
+            continue
+
+        # Determine review output file
+        format_ext = FORMAT_EXTENSIONS[args.output_format]
+        review_file = output_dir / f"{doc_basename}{format_ext}"
+
+        # Determine if we should write diff/msg files
+        write_diff_msg = args.diff or args.output_format in ("text", "json")
+
+        # Extract commit message and diff first (before stripping markers)
+        commit_msg, diff = "", ""
+        if write_diff_msg:
+            if args.output_format == "json":
+                # Will extract from JSON below
+                pass
+            else:
+                # Parse from text format markers
+                commit_msg, diff = parse_review_text(review_text)
+
+        # For non-text formats with --diff, strip the markers from display output
+        display_text = review_text
+        if args.diff and args.output_format in ("markdown", "html"):
+            display_text = strip_diff_markers(review_text)
+
+        # Build formatted output text
+        if args.output_format == "text":
+            output_text = review_text
+        elif args.output_format == "json":
+            # Try to parse JSON response
+            try:
+                review_data = json.loads(review_text)
+            except json.JSONDecodeError:
+                print("Warning: Response is not valid JSON", file=sys.stderr)
+                review_data = {"raw_response": review_text}
+
+            # Extract diff/msg from JSON if present
+            if write_diff_msg:
+                if isinstance(review_data, dict) and "raw_response" not in review_data:
+                    commit_msg = review_data.get("commit_message", "")
+                    diff = review_data.get("diff", "")
+
+            # Add metadata
+            output_data = {
+                "metadata": {
+                    "doc_file": doc_file,
+                    "provider": args.provider,
+                    "provider_name": provider_name,
+                    "model": model,
+                    "commit_prefix": commit_prefix,
+                },
+                "review": review_data,
+            }
+            output_text = json.dumps(output_data, indent=2)
+        elif args.output_format == "markdown":
+            output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{display_text}
+"""
+        elif args.output_format == "html":
+            output_text = f"""<!DOCTYPE html>
+<html>
+<head>
+<meta charset="utf-8">
+<title>Review: {doc_path.name}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; max-width: 900px; margin: 2em auto; padding: 0 1em; }}
+h1 {{ color: #333; }}
+.review-meta {{ color: #666; font-style: italic; }}
+pre {{ background: #f5f5f5; padding: 1em; overflow-x: auto; }}
+</style>
+</head>
+<body>
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+<div class="review-content">
+{display_text}
+</div>
+</body>
+</html>
+"""
+
+        # Write formatted review to file
+        review_file.write_text(output_text)
+        print(f"Review written to: {review_file}", file=sys.stderr)
+
+        # Write diff/msg files
+        if write_diff_msg:
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+            else:
+                msg_file.write_text("# No commit message generated\n")
+                print("Warning: Could not extract commit message", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+            else:
+                diff_file.write_text("# No changes suggested\n")
+                print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print to stdout unless quiet (or multiple files without verbose)
+        show_stdout = not args.quiet and (num_files == 1 or args.verbose)
+        if show_stdout:
+            print(
+                f"\n=== Documentation Review: {doc_path.name} "
+                f"(via {provider_name}) ==="
+            )
+            print(output_text)
+
+            # Print usage instructions for text format
+            if args.output_format == "text":
+                print("\n=== Output Files ===")
+                print(f"Commit message: {msg_file}")
+                print(f"Diff file:      {diff_file}")
+                print("\nTo apply changes:")
+                print(f"  git apply {diff_file}")
+                print(f"  git commit -sF {msg_file}")
+
+        # Send email if requested
+        if args.send_email:
+            if args.output_format != "text":
+                print(
+                    f"Note: Email will be sent as plain text regardless of "
+                    f"--format={args.output_format}",
+                    file=sys.stderr,
+                )
+
+            review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+            # Build email body
+            email_body = f"""AI-generated documentation review of {doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+            if args.verbose:
+                print("", file=sys.stderr)
+                print("=== Email Details ===", file=sys.stderr)
+                print(f"Subject: {review_subject}", file=sys.stderr)
+                print("=====================", file=sys.stderr)
+
+            send_email(
+                args.to_addrs,
+                args.cc_addrs,
+                from_addr,
+                review_subject,
+                None,
+                email_body,
+                args.dry_run,
+                args.verbose,
+            )
+
+            if not args.dry_run:
+                print("", file=sys.stderr)
+                print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+    # Print summary for multiple files
+    if num_files > 1:
+        print(f"\n{'=' * 60}", file=sys.stderr)
+        print(f"Processed {num_files} files", file=sys.stderr)
+        print(f"Output directory: {output_dir}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v11 5/6] doc: add AI-assisted patch review to contributing guide
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (3 preceding siblings ...)
  2026-03-27 15:41     ` [PATCH v11 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-03-27 15:41     ` Stephen Hemminger
  2026-03-27 15:41     ` [PATCH v11 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a new section to the contributing guide describing the
analyze-patch.py script which uses AI providers to review patches
against DPDK coding standards before submission to the mailing list.

The new section covers basic usage, provider selection, patch series
handling, LTS release review, and output format options. A note
clarifies that AI review supplements but does not replace human
review.

Also add a reference to the script in the new driver guide's
test tools checklist.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/contributing/new_driver.rst |  2 +
 doc/guides/contributing/patches.rst    | 59 ++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/guides/contributing/new_driver.rst b/doc/guides/contributing/new_driver.rst
index 555e875329..6c0d356cfd 100644
--- a/doc/guides/contributing/new_driver.rst
+++ b/doc/guides/contributing/new_driver.rst
@@ -210,3 +210,5 @@ Be sure to run the following test tools per patch in a patch series:
 * `check-doc-vs-code.sh`
 * `check-spdx-tag.sh`
 * Build documentation and validate how output looks
+* Optionally run ``analyze-patch.py`` for AI-assisted review
+  (see :ref:`ai_assisted_review` in the Contributing Guide)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 5f554d47e6..1e50799c19 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -183,6 +183,10 @@ Make your planned changes in the cloned ``dpdk`` repo. Here are some guidelines
 
 * Code and related documentation must be updated atomically in the same patch.
 
+* Consider running the :ref:`AI-assisted review <ai_assisted_review>` tool
+  before submitting to catch common issues early.
+  This is encouraged but not required.
+
 Once the changes have been made you should commit them to your local repo.
 
 For small changes, that do not require specific explanations, it is better to keep things together in the
@@ -503,6 +507,61 @@ Additionally, when contributing to the DTS tool, patches should also be checked
 the ``dts-check-format.sh`` script in the ``devtools`` directory of the DPDK repo.
 To run the script, extra :ref:`Python dependencies <dts_deps>` are needed.
 
+
+.. _ai_assisted_review:
+
+AI-Assisted Patch Review
+------------------------
+
+Contributors may optionally use the ``devtools/analyze-patch.py`` script
+to get an AI-assisted review of patches before submitting them to the mailing list.
+The script checks patches against the DPDK coding standards and contribution
+guidelines documented in ``AGENTS.md``.
+
+The script supports multiple AI providers (Anthropic Claude, OpenAI ChatGPT,
+xAI Grok, Google Gemini).  An API key for the chosen provider must be set
+in the corresponding environment variable (see ``--list-providers``).
+
+Basic usage::
+
+   # Review a single patch (default provider: Anthropic Claude)
+   devtools/analyze-patch.py my-patch.patch
+
+   # Use a different provider
+   devtools/analyze-patch.py -p openai my-patch.patch
+
+   # Review for an LTS branch (enables stricter rules)
+   devtools/analyze-patch.py -r 24.11 my-patch.patch
+
+   # List available providers and their API key variables
+   devtools/analyze-patch.py --list-providers
+
+For a patch series in an mbox file, the ``--split-patches`` option reviews
+each patch individually::
+
+   devtools/analyze-patch.py --split-patches series.mbox
+
+   # Review only a range of patches
+   devtools/analyze-patch.py --split-patches --patch-range 1-5 series.mbox
+
+When reviewing for a Long Term Stable (LTS) release, use the ``-r`` option
+with the target version.  Any DPDK release with minor version ``.11``
+(e.g., 23.11, 24.11) is automatically recognized as LTS,
+and the script will enforce stricter rules: bug fixes only, no new features or APIs.
+
+Output can be formatted as plain text (default), Markdown, HTML, or JSON::
+
+   devtools/analyze-patch.py -f markdown -o review.md my-patch.patch
+
+The review guidelines in ``AGENTS.md`` focus on correctness bug detection
+and other DPDK-specific requirements. Commit message formatting and
+SPDX/copyright compliance are checked by ``checkpatches.sh`` and are
+not duplicated in the AI review.
+
+.. note::
+
+   Always verify AI suggestions before acting on them.
+
 .. _contrib_check_compilation:
 
 Checking Compilation
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v11 6/6] MAINTAINERS: add section for AI review tools
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (4 preceding siblings ...)
  2026-03-27 15:41     ` [PATCH v11 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
@ 2026-03-27 15:41     ` Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-03-27 15:41 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add maintainer entries for the AI-assisted code review tooling:
AGENTS.md, analyze-patch.py, compare-reviews.sh, and
review-doc.py.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0f5539f851..c052b6c203 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -109,6 +109,14 @@ F: license/
 F: .editorconfig
 F: .mailmap
 
+AI review tools
+M: Stephen Hemminger <stephen@networkplumber.org>
+M: Aaron Conole <aconole@redhat.com>
+F: AGENTS.md
+F: devtools/analyze-patch.py
+F: devtools/compare-reviews.sh
+F: devtools/review-doc.py
+
 Linux kernel uAPI headers
 M: Maxime Coquelin <maxime.coquelin@redhat.com>
 F: devtools/linux-uapi.sh
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (7 preceding siblings ...)
  2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-04-01 15:38   ` Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
                       ` (5 more replies)
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  9 siblings, 6 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add guidelines and tooling for AI-assisted code review of DPDK
patches.

AGENTS.md provides a two-tier review framework: correctness bugs
(resource leaks, use-after-free, race conditions) are reported at
>=50% confidence; style issues require >80% with false positive
suppression. Mechanical checks handled by checkpatches.sh are
excluded to avoid redundant findings.

The analyze-patch.py and review-doc.py scripts support multiple AI
providers (Anthropic, OpenAI, xAI, Google) with mbox splitting,
prompt caching, direct SMTP sending, and token usage tracking with
optional cost estimation.

v12 - add token usage metrics to analyze-patch.py and review-doc.py
      call_api() returns structured TokenUsage alongside response
      always print token summary (input/output/cache) to stderr
      add -c/--show-costs for per-provider cost estimation
      include token_usage in JSON metadata output

v11 - add more checks related to VLAN and MTU
      add checks for unsigned overflow on shifts

v10 - add more checks about MTU, buffer size and scatter
      based on Ferruh's revision in 2024.

v9 - update AGENTS to reduce false positives
   - remove commit message/SPDX items from prompt (checkpatch's job).
   - update contributing guide text to match actual AGENTS.md coverage.

Stephen Hemminger (6):
  doc: add AGENTS.md for AI code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script
  doc: add AI-assisted patch review to contributing guide
  MAINTAINERS: add section for AI review tools

 AGENTS.md                              | 2162 ++++++++++++++++++++++++
 MAINTAINERS                            |    8 +
 devtools/analyze-patch.py              | 1528 +++++++++++++++++
 devtools/compare-reviews.sh            |  192 +++
 devtools/review-doc.py                 | 1277 ++++++++++++++
 doc/guides/contributing/new_driver.rst |    2 +
 doc/guides/contributing/patches.rst    |   59 +
 7 files changed, 5228 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100755 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.53.0


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v12 1/6] doc: add AGENTS.md for AI code review tools
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-04-01 15:38     ` Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Provide structured guidelines for AI tools reviewing DPDK
patches. Focuses on correctness bug detection (resource leaks,
use-after-free, race conditions), C coding style, forbidden
tokens, API conventions, and severity classifications.

Mechanical checks already handled by checkpatches.sh (SPDX
format, commit message formatting, tag ordering) are excluded
to avoid redundant and potentially contradictory findings.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 2162 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2162 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..d49ed859f1
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,2162 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+## CRITICAL INSTRUCTION - READ FIRST
+
+This document has two categories of review rules with different
+confidence thresholds:
+
+### 1. Correctness Bugs -- HIGHEST PRIORITY (report at >=50% confidence)
+
+**Always report potential correctness bugs.** These are the most
+valuable findings. When in doubt, report them with a note about
+your confidence level. A possible use-after-free or resource leak
+is worth mentioning even if you are not certain.
+
+Correctness bugs include:
+- Use-after-free (accessing memory after `free`/`rte_free`)
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference
+- Buffer overflows or out-of-bounds access
+- Uninitialized variable use in a reachable code path
+- Race conditions (unsynchronized shared state)
+- `volatile` used instead of atomic operations for inter-thread shared variables
+- `__atomic_load_n()`/`__atomic_store_n()`/`__atomic_*()` GCC built-ins instead of `rte_atomic_*_explicit()`
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` legacy barriers instead of `rte_atomic_thread_fence()`
+- Missing error checks on functions that can fail
+- Error paths that skip cleanup (goto labels, missing free/close)
+- Incorrect error propagation (wrong return value, lost errno)
+- Logic errors in conditionals (wrong operator, inverted test)
+- Integer overflow/truncation in size calculations
+- Missing bounds checks on user-supplied sizes or indices
+- `mmap()` return checked against `NULL` instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=`
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied data
+- `1 << n` on 64-bit bitmask (must use `1ULL << n` or `RTE_BIT64()`)
+- Left shift of narrow unsigned (`uint8_t`/`uint16_t`) used as 64-bit value (sign extension via implicit `int` promotion)
+- Variable assigned then overwritten before being read (dead store)
+- Same variable used as loop counter in nested loops
+- `memcpy`/`memcmp`/`memset` with same pointer for source and destination (no-op or undefined)
+- `rte_mbuf_raw_free_bulk()` called on mbufs that may originate from different mempools (Tx burst, ring dequeue)
+- MTU confused with frame length (MTU is L3 payload; frame length = MTU + L2 overhead)
+- Using `dev_conf.rxmode.mtu` after configure instead of `dev->data->mtu`
+- Hardcoded Ethernet overhead instead of per-device calculation
+- MTU set without enabling `RTE_ETH_RX_OFFLOAD_SCATTER` when frame size exceeds mbuf data room
+- `mtu_set` callback rejects valid MTU when scatter Rx is already enabled
+- Rx queue setup silently drops oversized packets instead of enabling scatter or returning an error
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size check
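
The MTU items above all reduce to one piece of arithmetic. A minimal
sketch of the distinction (Python used for illustration; the constants
mirror `RTE_ETHER_HDR_LEN` and `RTE_ETHER_CRC_LEN`, and the helper
names are hypothetical, not DPDK APIs):

```python
RTE_ETHER_HDR_LEN = 14  # dst MAC + src MAC + EtherType
RTE_ETHER_CRC_LEN = 4

def frame_len_from_mtu(mtu, extra_overhead=0):
    # MTU counts only the L3 payload; the on-wire frame adds L2 overhead.
    # extra_overhead models per-device additions such as VLAN tags.
    return mtu + RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN + extra_overhead

def needs_scatter(mtu, mbuf_data_room, extra_overhead=0):
    # If a single mbuf cannot hold a full frame, scatter Rx must be
    # enabled (or the MTU rejected) rather than silently dropping packets.
    return frame_len_from_mtu(mtu, extra_overhead) > mbuf_data_room
```

A standard 1500-byte MTU gives a 1518-byte frame, which fits a
2048-byte mbuf data room; a 9000-byte MTU does not.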
+
+**Do NOT self-censor correctness bugs.** If you identify a code
+path where a resource could leak or memory could be used after
+free, report it. Do not talk yourself out of it.
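
Two of the integer items above can be made concrete. A sketch modeling
C's fixed-width arithmetic with explicit masks (Python stands in for C
here; the function names are illustrative only):

```python
U32 = 0xFFFFFFFF
U64 = 0xFFFFFFFFFFFFFFFF

def total_bytes_buggy(nb_pkts, pkt_len):
    # C: uint64_t total = nb_pkts * pkt_len;
    # The 32x32 multiply happens in 32 bits, so the upper bits are
    # already lost before the widening assignment.
    return (nb_pkts * pkt_len) & U32

def total_bytes_fixed(nb_pkts, pkt_len):
    # C: uint64_t total = (uint64_t)nb_pkts * pkt_len;
    # Widening one operand BEFORE the multiply keeps all 64 bits.
    return (nb_pkts * pkt_len) & U64

def bit64_mask(n):
    # C: uint64_t mask = RTE_BIT64(n);  /* or 1ULL << n */
    # Plain `1 << n` shifts a 32-bit int and is undefined for n >= 32.
    return (1 << n) & U64
```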
+
+### 2. Style, Process, and Formatting -- suppress false positives
+
+**NEVER list a style/process item under "Errors" or "Warnings" if
+you conclude it is correct.**
+
+Before outputting any style, formatting, or process error/warning,
+verify it is actually wrong. If your analysis concludes with
+phrases like "there's no issue here", "which is fine", "appears
+correct", "is acceptable", or "this is actually correct" -- then
+DO NOT INCLUDE IT IN YOUR OUTPUT AT ALL. Delete it. Omit it
+entirely.
+
+This suppression rule applies to: naming conventions,
+code style, and process compliance. It does NOT apply to
+correctness bugs listed above. (SPDX/copyright format and
+commit message formatting are handled by checkpatch and are
+excluded from AI review entirely.)
+
+---
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux Kernel. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+
+**Correctness bugs are the primary goal of AI review.** Style and
+formatting checks are secondary. A review that catches a
+use-after-free but misses a style nit is far more valuable than
+one that catches every style issue but misses the bug.
+
+**BEFORE OUTPUTTING YOUR REVIEW**: Re-read each item.
+- For correctness bugs: keep them. If you have reasonable doubt
+  that a code path is safe, report it.
+- For style/process items: if ANY item contains phrases like "is
+  fine", "no issue", "appears correct", "is acceptable",
+  "actually correct" -- DELETE THAT ITEM. Do not include it.
+
+### Correctness review guidelines
+- Trace error paths: for every function that allocates a resource
+  or acquires a lock, verify that ALL error paths after that point
+  release it
+- Check every `goto error` and early `return`: does it clean up
+  everything allocated so far?
+- Look for use-after-free: after `free(p)`, is `p` accessed again?
+- Check that error codes are propagated, not silently dropped
+- Report at >=50% confidence; note uncertainty if appropriate
+- It is better to report a potential bug that turns out to be safe
+  than to miss a real bug
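
The error-path rule above can be illustrated with a small sketch
(Python file descriptors stand in for C resources; the function name
and paths are hypothetical):

```python
import os

def open_pair(path_a, path_b):
    # Acquires two resources; every failure path after the first
    # acquisition must release it -- exactly what a reviewer traces.
    fd_a = os.open(path_a, os.O_RDONLY)
    try:
        fd_b = os.open(path_b, os.O_RDONLY)
    except OSError:
        os.close(fd_a)  # without this, fd_a leaks on the error path
        raise
    return fd_a, fd_b
```

In C the same shape appears as `goto close_a;` cleanup labels; the
review question is identical: does every exit after the first
acquisition reach the corresponding release?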
+
+### Style and process review guidelines
+- Only comment on style/process issues when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+- Do NOT comment on copyright years, SPDX format, or copyright
+  holders; these are not subject to AI review
+- Do NOT report an issue and then contradict yourself; if something
+  is acceptable, do not mention it at all
+- Do NOT include items in Errors/Warnings that you then say are
+  "acceptable" or "correct"
+- Do NOT mention things that are correct or "not an issue"; only
+  report actual problems
+- Do NOT speculate about contributor circumstances (employment,
+  company policies, etc.)
+- Before adding any style item to your review, ask: "Is this
+  actually wrong?" If no, omit it entirely.
+- NEVER write "(Correction: ...)"; if you need to correct yourself,
+  simply omit the item entirely
+- Do NOT add vague suggestions like "should be verified" or "should
+  be checked"; either it is wrong or do not mention it
+- Do NOT flag something as an Error and then say "which is correct"
+  in the same item
+- Do NOT say "no issue here" or "this is actually correct"; if there
+  is no issue, do not include it in your review
+- Do NOT analyze cross-patch dependencies or compilation order; you
+  cannot reliably determine this from patch review
+- Do NOT claim a patch "would cause compilation failure" based on
+  symbols used in other patches in the series
+- Review each patch individually for its own correctness; assume the
+  patch author ordered them correctly
+- When reviewing a patch series, OMIT patches that have no issues.
+  Do not include a patch in your output just to say "no issues
+  found" or to summarize what the patch does. Only include patches
+  where you have actual findings to report.
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard-coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
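+
+As a hedged illustration of the command-injection and
+input-validation items above (the function names, stats path, and
+allowed character set are illustrative, not an existing DPDK API):
+
+```c
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* BAD - user-supplied name flows into a shell command unescaped;
+ * a name like "eth0; rm -rf /" runs a second command */
+static int
+show_stats_bad(const char *name)
+{
+	char cmd[128];
+
+	snprintf(cmd, sizeof(cmd), "cat /var/stats/%s", name);
+	return system(cmd);
+}
+
+/* GOOD - validate external input against an allowlist before use;
+ * the allowlist excludes '/', '.', and shell metacharacters */
+static bool
+name_is_safe(const char *name)
+{
+	static const char allowed[] =
+		"abcdefghijklmnopqrstuvwxyz"
+		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+		"0123456789_-";
+
+	if (name == NULL || name[0] == '\0')
+		return false;
+	return strspn(name, allowed) == strlen(name);
+}
+```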
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- **`volatile` for inter-thread synchronization**: `volatile` does not
+  provide atomicity or memory ordering between threads. Use
+  `rte_atomic_load_explicit()`/`rte_atomic_store_explicit()` with
+  appropriate `rte_memory_order_*` instead. See the Shared Variable
+  Access section under Forbidden Tokens for details.
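+  A minimal single-file sketch using the C11 operations that DPDK's
+  `rte_atomic_*_explicit()` wrappers map onto (shown here with
+  `<stdatomic.h>` so the fragment stands alone):
+  ```c
+  #include <stdatomic.h>
+  #include <stdbool.h>
+
+  static atomic_uint data_ready;
+  static int shared_value;
+
+  /* writer: publish the data, then set the flag with release
+   * ordering so the data store becomes visible before the flag */
+  static void
+  publish(int v)
+  {
+      shared_value = v;
+      atomic_store_explicit(&data_ready, 1, memory_order_release);
+  }
+
+  /* reader: acquire ordering pairs with the writer's release */
+  static bool
+  try_consume(int *out)
+  {
+      if (atomic_load_explicit(&data_ready, memory_order_acquire) == 0)
+          return false;
+      *out = shared_value;
+      return true;
+  }
+  ```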
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- **Use-after-free** (any access to memory after it has been freed)
+- **Error path resource leaks**: For every allocation or fd open,
+  trace each error path (`goto`, early `return`, conditional) to
+  verify the resource is released. Common patterns to check:
+  - `malloc`/`rte_malloc` followed by a failure that does `return -1`
+    instead of `goto cleanup`
+  - `open()`/`socket()` fd not closed on a later error
+  - Lock acquired but not released on an error branch
+  - Partially initialized structure where early fields are allocated
+    but later allocation fails without freeing the early ones
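+  A compilable sketch of the pattern (struct and function names are
+  illustrative):
+  ```c
+  #include <fcntl.h>
+  #include <stdlib.h>
+  #include <unistd.h>
+
+  struct ctx {
+      char *buf;
+      int fd;
+  };
+
+  /* BAD - the open() failure path returns without freeing buf */
+  static int
+  ctx_setup_bad(struct ctx *c, const char *path, size_t size)
+  {
+      c->buf = malloc(size);
+      if (c->buf == NULL)
+          return -1;
+      c->fd = open(path, O_RDONLY);
+      if (c->fd < 0)
+          return -1;               /* leaks c->buf */
+      return 0;
+  }
+
+  /* GOOD - every failure after an acquisition unwinds it */
+  static int
+  ctx_setup(struct ctx *c, const char *path, size_t size)
+  {
+      c->buf = malloc(size);
+      if (c->buf == NULL)
+          return -1;
+      c->fd = open(path, O_RDONLY);
+      if (c->fd < 0)
+          goto free_buf;
+      return 0;
+
+  free_buf:
+      free(c->buf);
+      c->buf = NULL;
+      return -1;
+  }
+  ```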
+- **Double-free / double-close**: resource freed in both a normal
+  path and an error path, or fd closed but not set to -1 allowing
+  a second close
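+  The fd half of this pattern can be made safe by invalidating the
+  handle after release (illustrative sketch):
+  ```c
+  #include <unistd.h>
+
+  struct conn {
+      int fd;
+  };
+
+  /* GOOD - invalidate after close so repeated teardown is a no-op */
+  static void
+  conn_close(struct conn *c)
+  {
+      if (c->fd >= 0) {
+          close(c->fd);
+          c->fd = -1;      /* a second conn_close() does nothing */
+      }
+  }
+  ```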
+- **Missing error checks**: functions that can fail (malloc, open,
+  ioctl, etc.) whose return value is not checked
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Usage of deprecated APIs when replacements exist
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already
+  shows (remove them)
+- **Process-shared synchronization errors** (pthread mutexes in
+  shared memory without `PTHREAD_PROCESS_SHARED`)
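+  A hedged sketch of the correct initialization (the structure name
+  is illustrative):
+  ```c
+  #include <pthread.h>
+
+  struct shm_region {
+      pthread_mutex_t lock;    /* lives in memory shared by processes */
+  };
+
+  /* BAD - pthread_mutex_init(&shm->lock, NULL): default attributes
+   * are process-private, so the mutex cannot synchronize processes */
+
+  /* GOOD - mark the mutex process-shared before initializing it */
+  static int
+  shm_lock_init(struct shm_region *shm)
+  {
+      pthread_mutexattr_t attr;
+      int ret;
+
+      ret = pthread_mutexattr_init(&attr);
+      if (ret != 0)
+          return ret;
+      ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+      if (ret == 0)
+          ret = pthread_mutex_init(&shm->lock, &attr);
+      pthread_mutexattr_destroy(&attr);
+      return ret;
+  }
+  ```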
+- **`mmap()` checked against NULL instead of `MAP_FAILED`**: `mmap()` returns
+  `MAP_FAILED` (i.e., `(void *)-1`) on failure, NOT `NULL`. Checking
+  `== NULL` or `!= NULL` will miss the error and use an invalid pointer.
+  ```c
+  /* BAD - mmap never returns NULL on failure */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == NULL)       /* WRONG - will not catch MAP_FAILED */
+      return -1;
+
+  /* GOOD */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == MAP_FAILED)
+      return -1;
+  ```
+- **Statistics accumulation using `=` instead of `+=`**: When accumulating
+  statistics (counters, byte totals, packet counts), using `=` overwrites
+  the running total with only the latest value. This silently produces
+  wrong results.
+  ```c
+  /* BAD - overwrites instead of accumulating */
+  stats->rx_packets = nb_rx;
+  stats->rx_bytes = total_bytes;
+
+  /* GOOD - accumulates over time */
+  stats->rx_packets += nb_rx;
+  stats->rx_bytes += total_bytes;
+  ```
+  Note: `=` is correct for gauge-type values (e.g., queue depth, link
+  status) and for initial assignment. Only flag when the context is
+  clearly incremental accumulation (loop bodies, per-burst counters,
+  callback tallies).
+- **Integer multiply without widening cast**: When multiplying integers
+  to produce a result wider than the operands (sizes, offsets, byte
+  counts), the multiplication is performed at the operand width and
+  the upper bits are silently lost before the assignment. This applies
+  to any narrowing scenario: 16×16 assigned to a 32-bit variable,
+  32×32 assigned to a 64-bit variable, etc.
+  ```c
+  /* BAD - 32×32 overflows before widening to 64 */
+  uint64_t total_size = num_entries * entry_size;  /* both are uint32_t */
+  size_t offset = ring->idx * ring->desc_size;     /* 32×32 → truncated */
+
+  /* BAD - 16×16 overflows before widening to 32 */
+  uint32_t byte_count = pkt_len * nb_segs;         /* both are uint16_t */
+
+  /* GOOD - widen before multiply */
+  uint64_t total_size = (uint64_t)num_entries * entry_size;
+  size_t offset = (size_t)ring->idx * ring->desc_size;
+  uint32_t byte_count = (uint32_t)pkt_len * nb_segs;
+  ```
+- **Unbounded descriptor chain traversal**: When walking a chain of
+  descriptors (virtio, DMA, NIC Rx/Tx rings) where the chain length
+  or next-index comes from guest memory or an untrusted API caller,
+  the traversal MUST have a bounds check or loop counter to prevent
+  infinite loops or out-of-bounds access from malicious/corrupt data.
+  ```c
+  /* BAD - guest controls desc[idx].next with no bound */
+  while (desc[idx].flags & VRING_DESC_F_NEXT) {
+      idx = desc[idx].next;          /* guest-supplied, unbounded */
+      process(desc[idx]);
+  }
+
+  /* GOOD - cap iterations to descriptor ring size */
+  for (i = 0; i < ring_size; i++) {
+      if (!(desc[idx].flags & VRING_DESC_F_NEXT))
+          break;
+      idx = desc[idx].next;
+      if (idx >= ring_size)          /* bounds check */
+          return -EINVAL;
+      process(desc[idx]);
+  }
+  ```
+  This applies to any chain/linked-list traversal where indices or
+  pointers originate from untrusted input (guest VMs, user-space
+  callers, network packets).
+- **Bitmask shift using `1` instead of `1ULL` on 64-bit masks**: The
+  literal `1` is `int` (32 bits). Shifting it by 32 or more is
+  undefined behavior; shifting it by less than 32 but assigning to a
+  `uint64_t` silently zeroes the upper 32 bits. Use `1ULL << n`,
+  `UINT64_C(1) << n`, or the DPDK `RTE_BIT64(n)` macro.
+  ```c
+  /* BAD - 1 is int, UB if n >= 32, wrong if result used as uint64_t */
+  uint64_t mask = 1 << bit_pos;
+  if (features & (1 << VIRTIO_NET_F_MRG_RXBUF))  /* bit 15 OK, bit 32+ UB */
+
+  /* GOOD */
+  uint64_t mask = UINT64_C(1) << bit_pos;
+  uint64_t mask = 1ULL << bit_pos;
+  uint64_t mask = RTE_BIT64(bit_pos);        /* preferred in DPDK */
+  if (features & RTE_BIT64(VIRTIO_NET_F_MRG_RXBUF))
+  ```
+  Note: `1U << n` is acceptable when the mask is known to be 32-bit
+  (e.g., `uint32_t` register fields with `n < 32`). Only flag when
+  the result is stored in, compared against, or returned as a 64-bit
+  type, or when `n` could be >= 32.
+- **Left shift of narrow unsigned type sign-extends to 64-bit**: When
+  a `uint8_t` or `uint16_t` value is left-shifted, C integer promotion
+  converts it to `int` (signed 32-bit) before the shift. If the result
+  has bit 31 set, implicit conversion to `uint64_t`, `size_t`, or use
+  in pointer arithmetic sign-extends the upper 32 bits to all-1s,
+  producing a wrong address or value. This is Coverity SIGN_EXTENSION.
+  The fix is to cast the narrow operand to an unsigned type at least as
+  wide as the target before shifting.
+  ```c
+  /* BAD - uint16_t promotes to signed int, bit 31 may set,
+   * then sign-extends when converted to 64-bit for pointer math */
+  uint16_t idx = get_index();
+  void *addr = base + (idx << wqebb_shift);      /* SIGN_EXTENSION */
+  uint64_t off = (uint64_t)(idx << shift);        /* too late: shift already in int */
+
+  /* BAD - uint8_t shift with result used as size_t */
+  uint8_t page_order = get_order();
+  size_t size = page_order << PAGE_SHIFT;          /* promotes to int first */
+
+  /* GOOD - cast before shift */
+  void *addr = base + ((uint64_t)idx << wqebb_shift);
+  uint64_t off = (uint64_t)idx << shift;
+  size_t size = (size_t)page_order << PAGE_SHIFT;
+
+  /* GOOD - intermediate unsigned variable */
+  uint32_t offset = (uint32_t)idx << wqebb_shift;  /* OK if result fits 32 bits */
+  ```
+  Note: This is distinct from the `1 << n` pattern (where the literal
+  `1` is the problem) and from the integer-multiply pattern (where
+  the operation is `*` not `<<`). The mechanism is the same C integer
+  promotion rule, but the code patterns and Coverity checker names
+  differ. Only flag when the shift result is used in a context wider
+  than 32 bits (64-bit assignment, pointer arithmetic, function
+  argument expecting `uint64_t`/`size_t`). A shift whose result is
+  stored in a `uint32_t` or narrower variable is not affected.
+- **Variable overwrite before read (dead store)**: A variable is
+  assigned a value that is unconditionally overwritten before it is
+  ever read. This usually indicates a logic error (wrong variable
+  name, missing `if`, copy-paste mistake) or at minimum is dead code.
+  ```c
+  /* BAD - first assignment is never read */
+  ret = validate_input(cfg);
+  ret = apply_config(cfg);     /* overwrites without checking first ret */
+  if (ret != 0)
+      return ret;
+
+  /* GOOD - check each return value */
+  ret = validate_input(cfg);
+  if (ret != 0)
+      return ret;
+  ret = apply_config(cfg);
+  if (ret != 0)
+      return ret;
+  ```
+  Do NOT flag cases where the initial value is intentionally a default
+  that may or may not be overwritten (e.g., `int ret = 0;` followed
+  by a conditional assignment). Only flag unconditional overwrites
+  where the first value can never be observed.
+- **Shared loop counter in nested loops**: Using the same variable as
+  the loop counter in both an outer and inner loop causes the outer
+  loop to malfunction because the inner loop modifies its counter.
+  ```c
+  /* BAD - inner loop clobbers outer loop counter */
+  int i;
+  for (i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (i = 0; i < nb_descs; i++)    /* BUG: reuses i */
+          init_desc(i);
+  }
+
+  /* GOOD - distinct loop counters */
+  for (int i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (int j = 0; j < nb_descs; j++)
+          init_desc(j);
+  }
+  ```
+- **`memcpy`/`memcmp`/`memset` self-argument (same pointer as both
+  operands)**: Passing the same pointer as both source and destination
+  to `memcpy()` is undefined behavior per C99. Passing the same
+  pointer to both arguments of `memcmp()` is a no-op that always
+  returns 0, indicating a logic error (usually a copy-paste mistake
+  with the wrong variable name). The same applies to `rte_memcpy()`
+  and `memmove()` with identical arguments.
+  ```c
+  /* BAD - memcpy with same src and dst is undefined behavior */
+  memcpy(buf, buf, len);
+  rte_memcpy(dst, dst, len);
+
+  /* BAD - memcmp with same pointer always returns 0 (logic error) */
+  if (memcmp(key, key, KEY_LEN) == 0)  /* always true, wrong variable? */
+
+  /* BAD - likely copy-paste: should be comparing two different MACs */
+  if (memcmp(&eth->src_addr, &eth->src_addr, RTE_ETHER_ADDR_LEN) == 0)
+
+  /* GOOD - comparing two different things */
+  memcpy(dst, src, len);
+  if (memcmp(&eth->src_addr, &eth->dst_addr, RTE_ETHER_ADDR_LEN) == 0)
+  ```
+  This pattern almost always indicates a copy-paste bug where one of
+  the arguments should be a different variable.
+- **`rte_mbuf_raw_free_bulk()` on mixed-pool mbuf arrays**: Tx burst functions
+  and ring/queue dequeue paths receive mbufs that may originate from different
+  mempools (applications are free to send mbufs from any pool).
+  `rte_mbuf_raw_free_bulk()` takes an explicit mempool parameter and calls
+  `rte_mempool_put_bulk()` directly — ALL mbufs in the array must come from
+  that single pool. If mbufs come from different pools, they are returned to
+  the wrong pool, corrupting pool accounting and causing hard-to-debug failures.
+  Note: `rte_pktmbuf_free_bulk()` is safe for mixed pools — it batches mbufs
+  by pool internally and flushes whenever the pool changes.
+  ```c
+  /* BAD - assumes all mbufs are from the same pool */
+  /* (in tx_burst completion or ring dequeue error path) */
+  rte_mbuf_raw_free_bulk(mp, mbufs, nb_mbufs);
+
+  /* GOOD - rte_pktmbuf_free_bulk handles mixed pools correctly */
+  rte_pktmbuf_free_bulk(mbufs, nb_mbufs);
+
+  /* GOOD - free individually (each mbuf returned to its own pool) */
+  for (i = 0; i < nb_mbufs; i++)
+      rte_pktmbuf_free(mbufs[i]);
+  ```
+  This applies to any path that frees mbufs submitted by the application:
+  Tx completion, Tx error cleanup, and ring/queue drain paths.
+  `rte_mbuf_raw_free_bulk()` is an optimization for the fast-free case
+  (`RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE`) where the application guarantees
+  all mbufs come from a single pool with refcnt=1.
+- **MTU confused with Ethernet frame length**: Maximum Transmission Unit
+  (MTU) is the maximum L3 payload size (e.g., 1500 bytes for standard
+  Ethernet). The maximum Ethernet *frame length* includes L2 overhead:
+  Ethernet header (14 bytes) + optional VLAN tags (4 bytes each) + CRC
+  (4 bytes). The overhead varies per device depending on supported
+  encapsulations (VLAN, QinQ, etc.). Confusing MTU with frame length
+  produces off-by-14-to-22-byte errors in packet size limits, buffer
+  sizing, and scattered Rx decisions.
+
+  **VLAN tag accounting:** The outer VLAN tag is L2 overhead and does
+  NOT count toward MTU (matching Linux and FreeBSD). A 1522-byte
+  single-tagged frame is valid at MTU 1500. However, in QinQ the
+  inner (customer) tag DOES consume MTU — it is part of the customer
+  frame. So QinQ with MTU 1500 allows only 1496 bytes of L3 payload
+  unless the port MTU is raised to 1504.
+
+  **Using `rxmode.mtu` after configure:** After `rte_eth_dev_configure()`
+  completes, the canonical MTU is stored in `dev->data->mtu`. The
+  `dev->data->dev_conf.rxmode.mtu` field is the user's *request* and
+  must not be read after configure — it becomes stale if
+  `rte_eth_dev_set_mtu()` is called later. Both configure and set_mtu
+  write to `dev->data->mtu`; PMDs should always read from there.
+
+  **Overhead calculation:** Do not hardcode a single overhead constant.
+  Use the device's own overhead calculation (typically available via
+  `dev_info.max_rx_pktlen - dev_info.max_mtu` or an internal
+  `eth_overhead` field). Different devices support different
+  encapsulations, so the overhead is not a universal constant.
+
+  **Scattered Rx decision:** PMDs compare maximum frame length
+  (MTU + per-device overhead) against Rx buffer size to decide
+  whether scattered Rx is needed. Comparing raw MTU against buffer
+  size is wrong — it underestimates the actual frame size by the
+  overhead.
+  ```c
+  /* BAD - MTU used where frame length is needed */
+  if (dev->data->mtu > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* BAD - hardcoded overhead, wrong for QinQ-capable devices */
+  #define ETHER_OVERHEAD 18  /* may be 22 or 26 for VLAN/QinQ */
+  max_frame = mtu + ETHER_OVERHEAD;
+
+  /* BAD - reading rxmode.mtu after configure (stale if set_mtu called) */
+  static int
+  mydrv_rx_queue_setup(...) {
+      mtu = dev->data->dev_conf.rxmode.mtu;  /* WRONG - may be stale */
+      ...
+  }
+
+  /* GOOD - use dev->data->mtu, the canonical post-configure value */
+  static int
+  mydrv_rx_queue_setup(...) {
+      uint16_t mtu = dev->data->mtu;
+      ...
+  }
+
+  /* GOOD - use per-device overhead for frame length calculation */
+  uint32_t frame_overhead = dev_info.max_rx_pktlen - dev_info.max_mtu;
+  uint32_t max_frame_len = dev->data->mtu + frame_overhead;
+  if (max_frame_len > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* GOOD - device-specific overhead constant derived from capabilities */
+  static uint32_t
+  mydrv_eth_overhead(struct rte_eth_dev *dev) {
+      uint32_t overhead = RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_VLAN)
+          overhead += RTE_VLAN_HLEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_QINQ)
+          overhead += RTE_VLAN_HLEN;
+      return overhead;
+  }
+  ```
+  Note: In `rte_eth_dev_configure()` itself, reading `rxmode.mtu` is
+  correct — that is where the user's request is consumed and written
+  to `dev->data->mtu`. Only flag reads of `rxmode.mtu` *outside*
+  configure (queue setup, start, link update, MTU set, etc.).
+- **Missing scatter Rx for large MTU**: When the configured MTU
+  produces a frame size (MTU + Ethernet overhead) larger than the mbuf
+  data buffer size (`rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`),
+  the PMD MUST either enable scatter Rx (multi-segment receive) or reject
+  the configuration. Silently accepting the MTU and then truncating or
+  dropping oversized packets is a correctness bug.
+  ```c
+  /* BAD - accepts MTU but will truncate packets that don't fit */
+  static int
+  mydrv_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+  {
+      /* No check against mbuf size or scatter capability */
+      dev->data->mtu = mtu;
+      return 0;
+  }
+
+  /* BAD - rejects valid MTU even though scatter is enabled */
+  if (frame_size > mbuf_data_size)
+      return -EINVAL;  /* wrong: should allow if scatter is on */
+
+  /* GOOD - check scatter and mbuf size */
+  if (!dev->data->scattered_rx &&
+      frame_size > dev->data->min_rx_buf_size - RTE_PKTMBUF_HEADROOM)
+      return -EINVAL;
+
+  /* GOOD - auto-enable scatter when needed */
+  if (frame_size > mbuf_data_size) {
+      if (!(dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_SCATTER))
+          return -EINVAL;
+      dev->data->dev_conf.rxmode.offloads |=
+          RTE_ETH_RX_OFFLOAD_SCATTER;
+      dev->data->scattered_rx = 1;
+  }
+  ```
+  Key relationships:
+  - `dev_info.max_rx_pktlen`: maximum frame the hardware can receive
+  - `dev_info.max_mtu`: maximum MTU = `max_rx_pktlen` - overhead
+  - `dev_info.min_rx_bufsize`: minimum Rx buffer the HW requires
+  - `dev_info.max_rx_bufsize`: maximum single-descriptor buffer size
+  - `mbuf data size = rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`
+  - When scatter is off: frame length must fit in a single mbuf
+  - When scatter is on: frame length can span multiple mbufs;
+    the PMD selects a scattered Rx function
+
+  This pattern should be checked in three places:
+  1. `dev_configure()` -- validate MTU against mbuf size / scatter
+  2. `rx_queue_setup()` -- select scattered vs non-scattered Rx path
+  3. `mtu_set()` -- runtime MTU change must re-validate
+- **Rx queue function selection ignoring scatter**: When a PMD has
+  separate fast-path Rx functions for scalar (single-segment) and
+  scattered (multi-segment) modes, it must select the scattered
+  variant whenever `dev->data->scattered_rx` is set OR when the
+  configured frame length exceeds the single mbuf data size.
+  Failing to do so causes the scalar Rx function to silently drop
+  or corrupt multi-segment packets.
+  ```c
+  /* BAD - only checks offload flag, ignores actual need */
+  if (rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER)
+      rx_func = mydrv_recv_scattered;
+  else
+      rx_func = mydrv_recv_single;  /* will drop oversized pkts */
+
+  /* GOOD - check both the flag and the size */
+  mbuf_size = rte_pktmbuf_data_room_size(rxq->mp) -
+              RTE_PKTMBUF_HEADROOM;
+  max_pkt = dev->data->mtu + overhead;
+  if ((rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER) ||
+      max_pkt > mbuf_size) {
+      dev->data->scattered_rx = 1;
+      rx_func = mydrv_recv_scattered;
+  } else {
+      rx_func = mydrv_recv_single;
+  }
+  ```
+
+### Architecture & Patterns
+- Code that violates existing patterns in the code base
+- Missing error handling
+- Code that is not safe against signals
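+  For example, signal handlers must not call functions that are not
+  async-signal-safe (`printf`, `malloc`, most of libc); the portable
+  idiom is to set a flag and act on it in the main loop:
+  ```c
+  #include <signal.h>
+
+  /* volatile sig_atomic_t is the one place volatile is correct: it
+   * synchronizes with a handler in the SAME thread, not across
+   * threads */
+  static volatile sig_atomic_t got_signal;
+
+  static void
+  on_signal(int signum)
+  {
+      (void)signum;
+      got_signal = 1;      /* async-signal-safe: just set a flag */
+  }
+  ```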
+- **Environment variables used for driver configuration instead of devargs**:
+  Drivers must use DPDK device arguments (`devargs`) for runtime
+  configuration, not environment variables. Devargs are preferred because
+  they are obviously device-specific rather than having global impact,
+  some launch methods strip all environment variables, and devargs can
+  be associated on a per-device basis rather than per-device-type.
+  Use `rte_kvargs_parse()` on the devargs string instead.
+  ```c
+  /* BAD - environment variable for driver tuning */
+  val = getenv("MYDRV_RX_BURST_SIZE");
+  if (val != NULL)
+      burst = atoi(val);
+
+  /* GOOD - devargs parsed at probe time */
+  static const char * const valid_args[] = { "rx_burst_size", NULL };
+  kvlist = rte_kvargs_parse(devargs->args, valid_args);
+  rte_kvargs_process(kvlist, "rx_burst_size", &parse_uint, &burst);
+  ```
+  Note: `getenv()` in EAL itself or in test/example code is acceptable.
+  This rule applies to libraries under `lib/` and drivers under `drivers/`.
+
+### New Library API Design
+
+When a patch adds a new library under `lib/`, review API design in
+addition to correctness and style.
+
+**API boundary.** A library should be a compiler, not a framework.
+The model is `rte_acl`: create a context, feed input, get structured
+output, caller decides what to do with it. No callbacks needed. If
+the library requires callers to implement a callback table to
+function, the boundary is wrong — the library is asking the caller
+to be its backend.
+
+**Callback structs** (Warning / Error). Any function-pointer struct
+in an installed header is an ABI break waiting to happen. Adding or
+reordering a member breaks all consumers.
+- Prefer a single callback parameter over an ops table.
+- \>5 callbacks: **Warning** — likely needs redesign.
+- \>20 callbacks: **Error** — this is an app plugin API, not a library.
+- All callbacks must have Doxygen (contract, return values, ownership).
+- Void-returning callbacks for failable operations swallow errors —
+  flag as **Error**.
+- Callbacks serving app-specific needs (e.g. `verbose_level_get`)
+  indicate wrong code was extracted into the library.
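+
+A hedged sketch of the preferred shape (all names are hypothetical,
+not an existing DPDK API): instead of an ops table the caller must
+implement, take a single callback parameter and propagate its return
+value.
+
+```c
+/* BAD (sketch) - ops table the caller must implement; adding a
+ * member later breaks the ABI, and the void return swallows errors */
+struct parser_ops {
+	int (*on_open)(void *ctx);
+	int (*on_record)(void *ctx, const char *rec, unsigned int len);
+	void (*on_error)(void *ctx, int err);
+};
+
+/* GOOD (sketch) - one callback parameter, errors propagated */
+typedef int (*record_cb)(const char *rec, unsigned int len, void *user);
+
+static int
+parser_run(const char *buf, unsigned int len, record_cb cb, void *user)
+{
+	unsigned int start = 0, i;
+	int ret;
+
+	/* split buf on newlines, handing each record to the caller */
+	for (i = 0; i <= len; i++) {
+		if (i == len || buf[i] == '\n') {
+			ret = cb(buf + start, i - start, user);
+			if (ret != 0)
+				return ret;    /* do not swallow errors */
+			start = i + 1;
+		}
+	}
+	return 0;
+}
+```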
+
+**Extensible structures.** Prefer TLV / tagged-array patterns over
+enum + union, following `rte_flow_item` and `rte_flow_action` as
+the model. Type tag + pointer to type-specific data allows adding
+types without ABI breaks. Flag as **Warning**:
+- Large enums (100+) consumers must switch on.
+- Unions that grow with every new feature.
+- Ask: "What changes when a feature is added next release?" If
+  "add an enum value and union arm" — should be TLV.
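+
+A hedged sketch of the TLV shape, loosely following `rte_flow_item`
+(type tag plus pointer to type-specific data; all names here are
+illustrative, not an existing API):
+
+```c
+#include <stddef.h>
+
+enum cfg_item_type {
+	CFG_ITEM_TYPE_END = 0,
+	CFG_ITEM_TYPE_QUEUE_COUNT,
+	CFG_ITEM_TYPE_BURST_SIZE,
+	/* new types are appended; existing consumers keep working */
+};
+
+struct cfg_item {
+	enum cfg_item_type type;
+	const void *conf;            /* points at a type-specific struct */
+};
+
+struct cfg_queue_count { unsigned int n; };
+struct cfg_burst_size { unsigned int n; };
+
+/* consumers look up the types they understand and skip the rest */
+static const void *
+cfg_find(const struct cfg_item *items, enum cfg_item_type type)
+{
+	for (; items->type != CFG_ITEM_TYPE_END; items++)
+		if (items->type == type)
+			return items->conf;
+	return NULL;
+}
+```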
+
+**Installed headers.** If it's in `headers` or `indirect_headers`
+in meson.build, it's public API. Don't call it "private." If truly
+internal, don't install it.
+
+**Global state.** Prefer handle-based APIs (`create`/`destroy`)
+over singletons. `rte_acl` allows multiple independent classifier
+instances; new libraries should do the same.
+
+**Output ownership.** Prefer caller-allocated buffers, or
+library-allocated buffers freed by the caller, over internal static
+buffers. If static buffers are used, document lifetime and ensure
+Doxygen examples don't show stale-pointer usage.
+
+---
+
+## C Coding Style
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a new line
+- Code style should be consistent within each file
+
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Consider using the `RTE_` or `rte_` prefix only for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` -> Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` -> Use: denylist/allowlist, blocklist/passlist
+- `cripple` -> Use: impacted, degraded, restricted, immobilized
+- `tribe` -> Use: team, squad
+- `sanity check` -> Use: coherence check, test, verification
+
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (likely(p != NULL))   /* Good - likely/unlikely don't change this */
+if (unlikely(p == NULL)) /* Good - likely/unlikely don't change this */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (errno != 0)     /* Good - this IS explicit */
+if (likely(a != 0)) /* Good - likely/unlikely don't change this */
+if (!a)             /* Bad - don't use ! on integers */
+if (a)              /* Bad - implicit, should be a != 0 */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+**Explicit comparison** means using `==` or `!=` operators (e.g., `x != 0`, `p == NULL`).
+**Implicit comparison** means relying on truthiness without an operator (e.g., `if (x)`, `if (!p)`).
+**Note**: `likely()` and `unlikely()` macros do NOT affect whether a comparison is explicit or implicit.
+
+### Boolean Usage
+
+Prefer `bool` (from `<stdbool.h>`) over `int` for variables,
+parameters, and return values that are purely true/false. Using
+`bool` makes intent explicit, enables compiler diagnostics for
+misuse, and is self-documenting.
+
+```c
+/* Bad - int used as boolean flag */
+int verbose = 0;
+int is_enabled = 1;
+
+int
+check_valid(struct item *item)
+{
+	if (item->flags & ITEM_VALID)
+		return 1;
+	return 0;
+}
+
+/* Good - bool communicates intent */
+bool verbose = false;
+bool is_enabled = true;
+
+bool
+check_valid(struct item *item)
+{
+	return item->flags & ITEM_VALID;
+}
+```
+
+**Guidelines:**
+- Use `bool` for variables that only hold true/false values
+- Use `bool` return type for predicate functions (functions that
+  answer a yes/no question, often named `is_*`, `has_*`, `can_*`)
+- Use `true`/`false` rather than `1`/`0` for boolean assignments
+- Boolean variables and parameters should not use explicit
+  comparison: `if (verbose)` is correct, not `if (verbose == true)`
+- `int` is still appropriate when a value can be negative, is an
+  error code, or carries more than two states
+
+**Structure fields:**
+- `bool` occupies 1 byte. In packed or cache-critical structures,
+  consider using a bitfield or flags word instead
+- For configuration structures and non-hot-path data, `bool` is
+  preferred over `int` for flag fields
+
+```c
+/* Bad - int flags waste space and obscure intent */
+struct port_config {
+	int promiscuous;     /* 0 or 1 */
+	int link_up;         /* 0 or 1 */
+	int autoneg;         /* 0 or 1 */
+	uint16_t mtu;
+};
+
+/* Good - bool for flag fields */
+struct port_config {
+	bool promiscuous;
+	bool link_up;
+	bool autoneg;
+	uint16_t mtu;
+};
+
+/* Also good - bitfield for cache-critical structures */
+struct fast_path_config {
+	uint32_t flags;      /* bitmask of CONFIG_F_* */
+	/* ... hot-path fields ... */
+};
+```
+
+**Do NOT flag:**
+- `int` return type for functions that return error codes (0 for
+  success, negative for error) — these are NOT boolean
+- `int` used for tri-state or multi-state values
+- `int` flags in existing code where changing the type would be a
+  large, unrelated refactor
+- Bitfield or flags-word approaches in performance-critical
+  structures
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
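+
+The designated-initializer guideline above can be illustrated with a small
+standalone program (the `port_params` struct and its fields are illustrative,
+not a DPDK API):
+
+```c
+#include <stdio.h>
+
+struct port_params {
+	unsigned int port_id;
+	unsigned int mtu;
+	int promiscuous;
+};
+
+int
+main(void)
+{
+	/* C99 designated initializers: name only the fields you set;
+	 * unnamed fields are zero-initialized */
+	struct port_params p = {
+		.port_id = 2,
+		.mtu = 1500,
+	};
+
+	printf("%u %u %d\n", p.port_id, p.mtu, p.promiscuous);
+	return 0;
+}
+```
+
+Fields added to the structure later are zeroed automatically, and the
+initializer remains correct if the member order changes.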
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Flexible Array Members
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
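+
+A sketch of the usual allocation idiom for a flexible array member — one
+`malloc()` sized for header plus payload (the `msg_alloc()` helper is
+hypothetical):
+
+```c
+#include <stdlib.h>
+#include <string.h>
+
+struct msg {
+	int len;
+	char data[];	/* flexible array member adds nothing to sizeof */
+};
+
+/* allocate header and payload in one block */
+static struct msg *
+msg_alloc(const char *payload, int len)
+{
+	struct msg *m = malloc(sizeof(*m) + len);
+
+	if (m == NULL)
+		return NULL;
+	m->len = len;
+	memcpy(m->data, payload, len);
+	return m;
+}
+
+int
+main(void)
+{
+	struct msg *m = msg_alloc("abc", 3);
+
+	if (m == NULL || m->len != 3)
+		return 1;
+	free(m);
+	return 0;
+}
+```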
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free() (CWE-14)
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store (CWE-14: Compiler Removal of Code to Clear Buffers). For security-sensitive data, use `explicit_bzero()`, `rte_memset_sensitive()`, or `rte_free_sensitive()` which the compiler is not permitted to eliminate.
+
+```c
+/* Bad - compiler may eliminate memset */
+memset(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - explicit_bzero cannot be optimized away */
+explicit_bzero(secret_key, sizeof(secret_key));
+free(secret_key);
+
+/* Good - DPDK wrapper for clearing sensitive data */
+rte_memset_sensitive(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for rte_malloc'd sensitive data, combined clear+free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()`, which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-accessible memory */
+struct rx_desc *ring = rte_malloc(NULL, n * sizeof(*ring), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+### Non-const Function Pointer Arrays
+
+Arrays of function pointers (ops tables, dispatch tables, callback arrays)
+should be declared `const` when their contents are fixed at compile time.
+A non-`const` function pointer array can be overwritten by bugs or exploits,
+and it prevents the compiler from placing the table in read-only memory.
+
+```c
+/* Bad - mutable when it doesn't need to be */
+static rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+
+/* Good - immutable dispatch table */
+static const rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+```
+
+**Exceptions** (do NOT flag):
+- Arrays modified at runtime for CPU feature detection or capability probing
+  (e.g., selecting a burst function based on `rte_cpu_get_flag_enabled()`)
+- Arrays containing mutable state (e.g., entries that are linked into lists)
+- Arrays populated dynamically via registration APIs
+- `dev_ops` or similar structures assigned per-device at init time
+
+Only flag when the array is fully initialized at declaration with constant
+values and never modified thereafter.
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `getenv()` | `rte_kvargs_parse()` / devargs | drivers/ (allowed in EAL, examples/, app/test/) |
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+
+#### Shared Variable Access: volatile vs Atomics
+
+Variables shared between threads or between a thread and a signal
+handler **must** use atomic operations. The C `volatile` keyword is
+NOT a substitute for atomics — it prevents compiler optimization
+of accesses but provides no atomicity guarantees and no memory
+ordering between threads. On some architectures, `volatile` reads
+and writes may tear on unaligned or multi-word values.
+
+DPDK provides C11 atomic wrappers that are portable across all
+supported compilers and architectures. Always use these for shared
+state.
+
+**Reading shared variables:**
+
+```c
+/* BAD - volatile provides no atomicity or ordering guarantee */
+volatile int stop_flag;
+if (stop_flag)           /* data race, compiler/CPU can reorder */
+    return;
+
+/* BAD - direct access to shared variable without atomic */
+if (shared->running)     /* undefined behavior if another thread writes */
+    process();
+
+/* GOOD - DPDK C11 atomic wrapper */
+if (rte_atomic_load_explicit(&shared->stop_flag, rte_memory_order_acquire))
+    return;
+
+/* GOOD - relaxed is fine for statistics or polling a flag where
+ * you don't need to synchronize other memory accesses */
+count = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+```
+
+**Writing shared variables:**
+
+```c
+/* BAD - volatile write */
+volatile int *flag = &shared->ready;
+*flag = 1;
+
+/* GOOD - atomic store with appropriate ordering */
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+```
+
+**Read-modify-write operations:**
+
+```c
+/* BAD - not atomic even with volatile */
+volatile uint64_t *counter = &stats->packets;
+*counter += nb_rx;       /* lost update: load, add, store are 3 operations */
+
+/* GOOD - atomic add */
+rte_atomic_fetch_add_explicit(&stats->packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Forbidden Atomic APIs in New Code
+
+New code **must not** use GCC/Clang `__atomic_*` built-ins or the
+legacy DPDK `rte_smp_*mb()` barriers. These are deprecated and
+will be removed. Use the DPDK C11 atomic wrappers instead.
+
+**GCC/Clang `__atomic_*` built-ins — do not use:**
+
+```c
+/* BAD - GCC built-in, not portable, not DPDK API */
+val = __atomic_load_n(&shared->count, __ATOMIC_RELAXED);
+__atomic_store_n(&shared->flag, 1, __ATOMIC_RELEASE);
+__atomic_fetch_add(&shared->counter, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&shared->state, &expected, desired,
+    0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+
+/* GOOD - DPDK C11 atomic wrappers */
+val = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+rte_atomic_fetch_add_explicit(&shared->counter, 1, rte_memory_order_relaxed);
+rte_atomic_compare_exchange_strong_explicit(&shared->state, &expected, desired,
+    rte_memory_order_acq_rel, rte_memory_order_acquire);
+rte_atomic_thread_fence(rte_memory_order_seq_cst);
+```
+
+Similarly, do not use `__sync_*` built-ins (`__sync_fetch_and_add`,
+`__sync_bool_compare_and_swap`, etc.) — these are the older GCC
+atomics with implicit full barriers and are even less appropriate
+than `__atomic_*`.
+
+**Legacy DPDK barriers — do not use:**
+
+```c
+/* BAD - legacy DPDK barriers, deprecated */
+rte_smp_mb();            /* full memory barrier */
+rte_smp_rmb();           /* read memory barrier */
+rte_smp_wmb();           /* write memory barrier */
+
+/* GOOD - C11 fence with explicit ordering */
+rte_atomic_thread_fence(rte_memory_order_seq_cst);   /* replaces rte_smp_mb() */
+rte_atomic_thread_fence(rte_memory_order_acquire);   /* replaces rte_smp_rmb() */
+rte_atomic_thread_fence(rte_memory_order_release);   /* replaces rte_smp_wmb() */
+
+/* BETTER - use ordering on the atomic operation itself when possible */
+val = rte_atomic_load_explicit(&shared->flag, rte_memory_order_acquire);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+The legacy `rte_atomic16/32/64_*()` type-specific functions (e.g.,
+`rte_atomic32_inc()`, `rte_atomic64_read()`) are also deprecated.
+Use `rte_atomic_fetch_add_explicit()`, `rte_atomic_load_explicit()`,
+etc. with standard C integer types.
+
+| Deprecated API | Replacement |
+|----------------|-------------|
+| `__atomic_load_n()` | `rte_atomic_load_explicit()` |
+| `__atomic_store_n()` | `rte_atomic_store_explicit()` |
+| `__atomic_fetch_add()` | `rte_atomic_fetch_add_explicit()` |
+| `__atomic_compare_exchange_n()` | `rte_atomic_compare_exchange_strong_explicit()` |
+| `__atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+| `__ATOMIC_RELAXED` | `rte_memory_order_relaxed` |
+| `__ATOMIC_ACQUIRE` | `rte_memory_order_acquire` |
+| `__ATOMIC_RELEASE` | `rte_memory_order_release` |
+| `__ATOMIC_ACQ_REL` | `rte_memory_order_acq_rel` |
+| `__ATOMIC_SEQ_CST` | `rte_memory_order_seq_cst` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence(rte_memory_order_seq_cst)` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence(rte_memory_order_acquire)` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence(rte_memory_order_release)` |
+| `rte_atomic32_inc(&v)` | `rte_atomic_fetch_add_explicit(&v, 1, rte_memory_order_relaxed)` |
+| `rte_atomic64_read(&v)` | `rte_atomic_load_explicit(&v, rte_memory_order_relaxed)` |
+
+#### Memory Ordering Guide
+
+Use the weakest ordering that is correct. Stronger ordering
+constrains hardware and compiler optimization unnecessarily.
+
+| DPDK Ordering | When to Use |
+|---------------|-------------|
+| `rte_memory_order_relaxed` | Statistics counters, polling flags where no other data depends on the value. Most common for simple counters. |
+| `rte_memory_order_acquire` | **Load** side of a flag/pointer that guards access to other shared data. Ensures subsequent reads see data published by the releasing thread. |
+| `rte_memory_order_release` | **Store** side of a flag/pointer that publishes shared data. Ensures all prior writes are visible to a thread that does an acquire load. |
+| `rte_memory_order_acq_rel` | Read-modify-write operations (e.g., `fetch_add`) that both consume and publish shared state in one operation. |
+| `rte_memory_order_seq_cst` | Rarely needed. Only when multiple independent atomic variables must be observed in a globally consistent total order. Avoid unless required. |
+
+**Common pattern — producer/consumer flag:**
+
+```c
+/* Producer thread: fill buffer, then signal ready */
+fill_buffer(buf, data, len);
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+
+/* Consumer thread: wait for flag, then read buffer */
+while (!rte_atomic_load_explicit(&shared->ready, rte_memory_order_acquire))
+    rte_pause();
+process_buffer(buf, len);  /* guaranteed to see producer's writes */
+```
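+
+For reference, the same release/acquire handoff in standard C11
+`<stdatomic.h>` (which the DPDK wrappers map onto) can be run standalone
+outside DPDK:
+
+```c
+#include <pthread.h>
+#include <stdatomic.h>
+#include <stdio.h>
+
+static int buf[4];
+static atomic_int ready;
+
+static void *
+producer(void *arg)
+{
+	(void)arg;
+	for (int i = 0; i < 4; i++)
+		buf[i] = i * i;
+	/* release: all prior writes become visible to an acquire load */
+	atomic_store_explicit(&ready, 1, memory_order_release);
+	return NULL;
+}
+
+int
+main(void)
+{
+	pthread_t t;
+
+	pthread_create(&t, NULL, producer, NULL);
+	while (!atomic_load_explicit(&ready, memory_order_acquire))
+		;	/* spin until producer publishes */
+	printf("%d %d %d %d\n", buf[0], buf[1], buf[2], buf[3]);
+	pthread_join(t, NULL);
+	return 0;
+}
+```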
+
+**Common pattern — statistics counter (no ordering needed):**
+
+```c
+rte_atomic_fetch_add_explicit(&port_stats->rx_packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Standalone Fences
+
+Prefer ordering on the atomic operation itself (acquire load,
+release store) over standalone fences. Standalone fences
+(`rte_atomic_thread_fence()`) are a blunt instrument that
+orders ALL memory accesses around the fence, not just the
+atomic variable you care about.
+
+```c
+/* Acceptable but less precise - standalone fence */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_relaxed);
+rte_atomic_thread_fence(rte_memory_order_release);
+
+/* Preferred - ordering on the operation itself */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+Standalone fences are appropriate when synchronizing multiple
+non-atomic writes (e.g., filling a structure before publishing
+a pointer to it) where annotating each write individually is
+impractical.
+
+#### When volatile Is Still Acceptable
+
+`volatile` remains correct for:
+- Memory-mapped I/O registers (hardware MMIO)
+- Variables shared with signal handlers in single-threaded contexts
+- Interaction with `setjmp`/`longjmp`
+
+`volatile` is NOT correct for:
+- Any variable accessed by multiple threads
+- Polling flags between lcores
+- Statistics counters updated from multiple threads
+- Flags set by one thread and read by another
+
+**Do NOT flag** `volatile` used for MMIO or hardware register access
+(common in drivers under `drivers/*/base/`).
+
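+The signal-handler case from the acceptable list above, sketched as a
+standalone program (`volatile sig_atomic_t` is the one type the C standard
+sanctions for this):
+
+```c
+#include <signal.h>
+#include <stdio.h>
+
+/* set only from the signal handler, read only from main flow */
+static volatile sig_atomic_t got_signal;
+
+static void
+handler(int sig)
+{
+	(void)sig;
+	got_signal = 1;
+}
+
+int
+main(void)
+{
+	signal(SIGUSR1, handler);
+	raise(SIGUSR1);		/* delivered synchronously to this thread */
+	if (got_signal)
+		printf("signal observed\n");
+	return 0;
+}
+```
+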
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
+
+### Process-Shared Synchronization
+
+When placing synchronization primitives in shared memory (memory accessible by multiple processes, such as DPDK primary/secondary processes or `mmap`'d regions), they **must** be initialized with process-shared attributes. Failure to do so causes **undefined behavior** that may appear to work in testing but fail unpredictably in production.
+
+#### pthread Mutexes in Shared Memory
+
+**This is an error** - mutex in shared memory without `PTHREAD_PROCESS_SHARED`:
+
+```c
+/* BAD - undefined behavior when used across processes */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+void
+init_shared(struct shared_data *shm)
+{
+	pthread_mutex_init(&shm->lock, NULL);  /* ERROR: missing pshared attribute */
+}
+```
+
+**Correct implementation**:
+
+```c
+/* GOOD - properly initialized for cross-process use */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int
+init_shared(struct shared_data *shm)
+{
+	pthread_mutexattr_t attr;
+	int ret;
+
+	ret = pthread_mutexattr_init(&attr);
+	if (ret != 0)
+		return -ret;
+
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret != 0) {
+		pthread_mutexattr_destroy(&attr);
+		return -ret;
+	}
+
+	ret = pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	return -ret;
+}
+```
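+
+A runnable sketch of why the attribute matters: the mutex lives in a
+`MAP_SHARED` anonymous mapping and is taken by both parent and child across
+`fork()` (Linux/POSIX only; error handling trimmed for brevity):
+
+```c
+#include <pthread.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int
+main(void)
+{
+	pthread_mutexattr_t attr;
+	struct shared_data *shm;
+
+	shm = mmap(NULL, sizeof(*shm), PROT_READ | PROT_WRITE,
+		MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (shm == MAP_FAILED)
+		return 1;
+
+	pthread_mutexattr_init(&attr);
+	pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	if (fork() == 0) {
+		/* child process increments under the shared lock */
+		pthread_mutex_lock(&shm->lock);
+		shm->counter++;
+		pthread_mutex_unlock(&shm->lock);
+		_exit(0);
+	}
+	wait(NULL);	/* child finished; its write is visible via the mapping */
+
+	printf("counter=%d\n", shm->counter);
+	return 0;
+}
+```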
+
+#### pthread Condition Variables in Shared Memory
+
+Condition variables also require the process-shared attribute:
+
+```c
+/* BAD - will not work correctly across processes */
+pthread_cond_init(&shm->cond, NULL);
+
+/* GOOD */
+pthread_condattr_t cattr;
+pthread_condattr_init(&cattr);
+pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+pthread_cond_init(&shm->cond, &cattr);
+pthread_condattr_destroy(&cattr);
+```
+
+#### pthread Read-Write Locks in Shared Memory
+
+```c
+/* BAD */
+pthread_rwlock_init(&shm->rwlock, NULL);
+
+/* GOOD */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init(&rwattr);
+pthread_rwlockattr_setpshared(&rwattr, PTHREAD_PROCESS_SHARED);
+pthread_rwlock_init(&shm->rwlock, &rwattr);
+pthread_rwlockattr_destroy(&rwattr);
+```
+
+#### When to Flag This Issue
+
+Flag as an **Error** when ALL of the following are true:
+1. A `pthread_mutex_t`, `pthread_cond_t`, `pthread_rwlock_t`, or `pthread_barrier_t` is initialized
+2. The primitive is stored in shared memory (identified by context such as: structure in `rte_malloc`/`rte_memzone`, `mmap`'d memory, memory passed to secondary processes, or structures documented as shared)
+3. The initialization uses `NULL` attributes or attributes without `PTHREAD_PROCESS_SHARED`
+
+**Do NOT flag** when:
+- The mutex is in thread-local or process-private heap memory (`malloc`)
+- The mutex is a local/static variable not in shared memory
+- The code already uses `pthread_mutexattr_setpshared()` with `PTHREAD_PROCESS_SHARED`
+- The synchronization uses DPDK primitives (`rte_spinlock_t`, `rte_rwlock_t`) which are designed for shared memory
+
+#### Preferred Alternatives
+
+For DPDK code, prefer DPDK's own synchronization primitives which are designed for shared memory:
+
+| pthread Primitive | DPDK Alternative |
+|-------------------|------------------|
+| `pthread_mutex_t` | `rte_spinlock_t` (busy-wait) or properly initialized pthread mutex |
+| `pthread_rwlock_t` | `rte_rwlock_t` |
+| `pthread_spinlock_t` | `rte_spinlock_t` |
+
+Note: `rte_spinlock_t` and `rte_rwlock_t` work correctly in shared memory without special initialization, but they are spinning locks unsuitable for long wait times.
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `PRId64`, `PRIu64`, `PRIx64` from `<inttypes.h>` |
+
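+Why the `PRI*64` macros are required, as a standalone sketch:
+
+```c
+#include <inttypes.h>
+#include <stdio.h>
+
+int
+main(void)
+{
+	uint64_t pkts = 12345678901234ULL;
+
+	/* Bad: "%llu" assumes uint64_t is unsigned long long; on LP64
+	 * platforms it is unsigned long and the format does not match */
+	/* printf("%llu\n", pkts); */
+
+	/* Good: PRIu64 expands to the correct conversion everywhere */
+	printf("%" PRIu64 "\n", pkts);
+	return 0;
+}
+```
+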
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
+---
+
+## Deprecated API Usage
+
+New patches must not introduce usage of deprecated APIs, macros, or functions.
+Deprecated items are marked with `RTE_DEPRECATED` or documented in the
+deprecation notices section of the release notes.
+
+### Rules for New Code
+
+- Do not call functions marked with `RTE_DEPRECATED` or `__rte_deprecated`
+- Do not use macros that have been superseded by newer alternatives
+- Do not use data structures or enum values marked as deprecated
+- Check `doc/guides/rel_notes/deprecation.rst` for planned deprecations
+- When a deprecated API has a replacement, use the replacement
+
+### Deprecating APIs
+
+A patch may mark an API as deprecated provided:
+
+- No remaining usages exist in the current DPDK codebase
+- The deprecation is documented in the release notes
+- A migration path or replacement API is documented
+- The `RTE_DEPRECATED` macro is used to generate compiler warnings
+
+```c
+/* Marking a function as deprecated */
+__rte_deprecated
+int
+rte_old_function(void);
+
+/* With a message pointing to the replacement */
+__rte_deprecated_msg("use rte_new_function() instead")
+int
+rte_old_function(void);
+```
+
+### Common Deprecated Patterns
+
+| Deprecated | Replacement | Notes |
+|-----------|-------------|-------|
+| `rte_atomic*_t` types | C11 atomics | Use `rte_atomic_xxx()` wrappers |
+| `rte_smp_*mb()` barriers | `rte_atomic_thread_fence()` | See Atomics section |
+| `pthread_*()` in portable code | `rte_thread_*()` | See Threading section |
+
+When reviewing patches that add new code, flag any usage of deprecated APIs
+as requiring change to use the modern replacement.
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - in .c file */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+**Note for AI reviewers**: You cannot verify compilation order or cross-patch dependencies from patch review alone. Do NOT flag patches claiming they "would fail to compile" based on symbols used in other patches in the series. Assume the patch author has ordered them correctly.
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be used in `/app` test directory
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include "test.h"
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, NOHUGE_OK, ASAN_OK, test_feature);
+```
+
+The `REGISTER_FAST_TEST` macro parameters are:
+- Test name (e.g., `feature_autotest`)
+- `NOHUGE_OK` or `HUGEPAGES_REQUIRED` - whether test can run without hugepages
+- `ASAN_OK` or `ASAN_FAILS` - whether test is compatible with Address Sanitizer
+- Test function name
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+- Release notes are NOT required for:
+  - Test-only changes (unit tests, functional tests)
+  - Internal APIs and helper functions (not exported to applications)
+  - Internal implementation changes that don't affect public API
+
+### RST Documentation Style
+
+When reviewing `.rst` documentation files, prefer **definition lists**
+over simple bullet lists where each item has a term and a description.
+Definition lists produce better-structured HTML/PDF output and are
+easier to scan.
+
+**When to suggest a definition list:**
+- A bullet list where each item starts with a bold or emphasized term
+  followed by a dash, colon, or long explanation
+- Lists of options, parameters, configuration values, or features
+  where each entry has a name and a description
+- Glossary-style enumerations
+
+**When a simple list is fine (do NOT flag):**
+- Short lists of items without descriptions (e.g., file names, steps)
+- Lists where items are single phrases or sentences with no term/definition structure
+- Enumerated steps in a procedure
+
+**RST definition list syntax:**
+
+```rst
+term 1
+   Description of term 1.
+
+term 2
+   Description of term 2.
+   Can span multiple lines.
+```
+
+**Example — flag this pattern:**
+
+```rst
+* **error** - Fail with error (default)
+* **truncate** - Truncate content to fit token limit
+* **summary** - Request high-level summary review
+```
+
+**Suggest rewriting as:**
+
+```rst
+error
+   Fail with error (default).
+
+truncate
+   Truncate content to fit token limit.
+
+summary
+   Request high-level summary review.
+```
+
+This is a **Warning**-level suggestion, not an Error. Do not flag it
+when the existing list structure is appropriate (see "when a simple
+list is fine" above).
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+- Internal APIs (used only within DPDK, not exported to applications) do NOT require release notes
+
+### ABI Compatibility and Symbol Exports
+
+**IMPORTANT**: DPDK uses automatic symbol map generation. Do **NOT** recommend
+manually editing `version.map` files - they are auto-generated from source code
+annotations.
+
+#### Symbol Export Macros
+
+New public functions must be annotated with export macros (defined in
+`rte_export.h`). Place the macro on the line immediately before the function
+definition in the `.c` file:
+
+```c
+/* For stable ABI symbols */
+RTE_EXPORT_SYMBOL(rte_foo_create)
+int
+rte_foo_create(struct rte_foo_config *config)
+{
+    /* ... */
+}
+
+/* For experimental symbols (include version when first added) */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_new_feature, 25.03)
+__rte_experimental
+int
+rte_foo_new_feature(void)
+{
+    /* ... */
+}
+
+/* For internal symbols (shared between DPDK components only) */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_foo_internal_helper)
+int
+rte_foo_internal_helper(void)
+{
+    /* ... */
+}
+```
+
+#### Symbol Export Rules
+
+- `RTE_EXPORT_SYMBOL` - Use for stable ABI functions
+- `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, ver)` - Use for new experimental APIs
+  (version is the DPDK release, e.g., `25.03`)
+- `RTE_EXPORT_INTERNAL_SYMBOL` - Use for functions shared between DPDK libs/drivers
+  but not part of public API
+- Export macros go in `.c` files, not headers
+- The build system generates linker version maps automatically
+
+#### What NOT to Review
+
+- Do **NOT** flag missing `version.map` updates - maps are auto-generated
+- Do **NOT** suggest adding symbols to `lib/*/version.map` files
+
+#### ABI Versioning for Changed Functions
+
+When changing the signature of an existing stable function, use versioning macros
+from `rte_function_versioning.h`:
+
+- `RTE_VERSION_SYMBOL` - Create versioned symbol for backward compatibility
+- `RTE_DEFAULT_SYMBOL` - Mark the new default version
+
+Follow ABI policy and versioning guidelines in the contributor documentation.
+Enable ABI checks with `DPDK_ABI_REF_VERSION` environment variable.
+
+---
+
+## LTS (Long Term Stable) Release Review
+
+LTS releases are DPDK versions ending in `.11` (e.g., 23.11, 22.11,
+21.11, 20.11, 19.11). When reviewing patches targeting an LTS branch,
+apply stricter criteria:
+
+### LTS-Specific Rules
+
+- **Only bug fixes allowed** -- no new features
+- **No new APIs** (experimental or stable)
+- **ABI must remain unchanged** -- no symbol additions, removals,
+  or signature changes
+- Backported fixes should reference the original commit with a
+  `Fixes:` tag
+- Copyright years should reflect when the code was originally
+  written
+- Be conservative: reject changes that are not clearly bug fixes
+
+### What to Flag on LTS Branches
+
+**Error:**
+- New feature code (new functions, new driver capabilities)
+- New experimental or stable API additions
+- ABI changes (new or removed symbols, changed function signatures)
+- Changes that add new configuration options or parameters
+
+**Warning:**
+- Large refactoring that goes beyond what is needed for a fix
+- Missing `Fixes:` tag on a backported bug fix
+- Missing `Cc: stable@dpdk.org`
+
+### When LTS Rules Apply
+
+LTS rules apply when the reviewer is told the target release is an
+LTS version (via the `--release` option or equivalent). If no
+release is specified, assume the patch targets the main development
+branch where new features and APIs are allowed.
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message and License
+
+Checked by `devtools/checkpatches.sh` -- not duplicated here.
+
+### Code Style
+
+- [ ] Lines <=100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+- [ ] No usage of deprecated APIs, macros, or functions
+- [ ] Process-shared primitives in shared memory use `PTHREAD_PROCESS_SHARED`
+- [ ] `mmap()` return checked against `MAP_FAILED`, not `NULL`
+- [ ] Statistics use `+=` not `=` for accumulation
+- [ ] Integer multiplies widened before operation when result is 64-bit
+- [ ] Descriptor chain traversals bounded by ring size or loop counter
+- [ ] 64-bit bitmasks use `1ULL <<` or `RTE_BIT64()`, not `1 <<`
+- [ ] Left shifts of `uint8_t`/`uint16_t` cast to unsigned target width before shift when result is 64-bit
+- [ ] No unconditional variable overwrites before read
+- [ ] Nested loops use distinct counter variables
+- [ ] No `memcpy`/`memcmp` with identical source and destination pointers
+- [ ] `rte_mbuf_raw_free_bulk()` not used on mixed-pool mbuf arrays (Tx paths, ring dequeue, error paths)
+- [ ] MTU not confused with frame length (MTU = L3 payload, frame = MTU + L2 overhead)
+- [ ] PMDs read `dev->data->mtu` after configure, not `dev_conf.rxmode.mtu`
+- [ ] Ethernet overhead not hardcoded -- derived from device capabilities
+- [ ] Scatter Rx enabled or error returned when frame length exceeds single mbuf data size
+- [ ] `mtu_set` allows large MTU when scatter Rx is active; re-selects Rx burst function
+- [ ] Rx queue setup selects scattered Rx function when frame length exceeds mbuf
+- [ ] Static function pointer arrays declared `const` when contents are compile-time fixed
+- [ ] `bool` used for pure true/false variables, parameters, and predicate return types
+- [ ] Shared variables use `rte_atomic_*_explicit()`, not `volatile` or bare access
+- [ ] No `__atomic_*()` GCC built-ins or `__ATOMIC_*` ordering constants (use `rte_atomic_*_explicit()` and `rte_memory_order_*`)
+- [ ] No `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` (use `rte_atomic_thread_fence()`)
+- [ ] Memory ordering is the weakest correct choice (`relaxed` for counters, `acquire`/`release` for publish/consume)
+- [ ] Sensitive data cleared with `explicit_bzero()`/`rte_free_sensitive()`, not `memset()`
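+
+Several of the arithmetic items above can be shown in one sketch (structure
+and field names are hypothetical):
+
+```c
+static void
+foo_stats_update(struct foo_stats *st, uint32_t nb_pkts, uint32_t nb_bytes)
+{
+    st->rx_packets += nb_pkts;      /* accumulate with +=, not = */
+    st->rx_bytes += nb_bytes;
+
+    /* Widen before multiplying so a 32x32 product keeps its upper bits */
+    st->ring_bytes = (uint64_t)st->nb_desc * st->desc_size;
+
+    /* 64-bit bitmask: RTE_BIT64(n) or 1ULL << n, never 1 << n */
+    st->flags |= RTE_BIT64(40);
+}
+```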
+
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+- [ ] New public functions have `RTE_EXPORT_*` macro in `.c` file
+- [ ] Experimental functions use `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, version)`
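+
+For example (hypothetical function), the header and `.c` file pair up as:
+
+```c
+/* rte_foo.h */
+__rte_experimental
+int
+rte_foo_frob(uint16_t id);
+
+/* rte_foo.c */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_frob, 25.07)
+int
+rte_foo_frob(uint16_t id)
+{
+    /* ... */
+}
+```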
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] RST docs use definition lists for term/description patterns
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (<=3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+- No strict line length limit for meson files; never flag lines under 100 characters
+
+```python
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list
+headers = files(
+    'header1.h',
+    'header2.h',
+    'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must comply with formatting standards
+- Use **`black`** for code formatting validation
+- Line length acceptable up to 100 characters
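+
+A typical validation run (adjust the path to the files you touched):
+
+```bash
+black --check --line-length 100 devtools/analyze-patch.py
+```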
+
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+
+*Correctness bugs (highest value findings):*
+- Use-after-free
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference on reachable code path
+- Buffer overflow or out-of-bounds access
+- Missing error check on a function that can fail, leading to undefined behavior
+- Race condition on shared mutable state without synchronization
+- `volatile` used instead of atomics for inter-thread shared variables
+- `__atomic_*()` GCC built-ins in new code (must use `rte_atomic_*_explicit()`)
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` in new code (must use `rte_atomic_thread_fence()`)
+- Error path that skips necessary cleanup
+- `mmap()` return value checked against NULL instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=` (overwrite vs increment)
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied indices
+- `1 << n` used for 64-bit bitmask (undefined behavior if n >= 32)
+- Left shift of `uint8_t`/`uint16_t` used in 64-bit context without widening cast (sign extension)
+- Variable assigned then unconditionally overwritten before read
+- Same variable used as counter in nested loops
+- `memcpy`/`memcmp` with same pointer as both arguments (UB or no-op logic error)
+- `rte_mbuf_raw_free_bulk()` on mbuf array where mbufs may come from different pools (Tx burst, ring dequeue)
+- MTU used where frame length is needed or vice versa (off by L2 overhead)
+- `dev_conf.rxmode.mtu` read after configure instead of `dev->data->mtu` (stale value)
+- MTU accepted without scatter Rx when frame size exceeds single mbuf capacity (silent truncation/drop)
+- `mtu_set` rejects valid MTU when scatter Rx is already enabled
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size comparison
+
+*Process and format errors:*
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+- pthread mutex/cond/rwlock in shared memory without `PTHREAD_PROCESS_SHARED`
+
+*API design errors (new libraries only):*
+- Ops/callback struct with 20+ function pointers in an installed header
+- Callback struct members with no Doxygen documentation
+- Void-returning callbacks for failable operations (errors silently swallowed)
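+
+Two of the highest-value error patterns in one sketch (labels hypothetical):
+
+```c
+void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+if (map == MAP_FAILED)      /* not: map == NULL */
+    goto err_close_fd;      /* error path must still release the fd */
+```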
+
+**Warning** (should fix):
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- New public function missing `RTE_EXPORT_*` macro
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+- Usage of deprecated APIs, macros, or functions in new code
+- RST documentation using bullet lists where definition lists would be more appropriate
+- Ops/callback struct with >5 function pointers in an installed header (ABI risk)
+- New API using fixed enum+union where TLV pattern would be more extensible
+- Installed header labeled "private" or "internal" in meson.build
+- New library using global singleton instead of handle-based API
+- Static function pointer array not declared `const` when contents are compile-time constant
+- `int` used instead of `bool` for variables or return values that are purely true/false
+- `rte_memory_order_seq_cst` used where weaker ordering (`relaxed`, `acquire`/`release`) suffices
+- Standalone `rte_atomic_thread_fence()` where ordering on the atomic operation itself would be clearer
+- `getenv()` used in a driver or library for runtime configuration instead of devargs
+- Hardcoded Ethernet overhead constant instead of per-device overhead calculation
+- PMD does not advertise `RTE_ETH_RX_OFFLOAD_SCATTER` in `rx_offload_capa` but hardware supports multi-segment Rx
+- PMD `dev_info` reports `max_rx_pktlen` or `max_mtu` inconsistent with each other or with the Ethernet overhead
+- `mtu_set` callback does not re-select the Rx burst function after changing MTU
+
+**Do NOT flag** (common false positives):
+- Missing `version.map` updates (maps are auto-generated from `RTE_EXPORT_*` macros)
+- Suggesting manual edits to any `version.map` file
+- SPDX/copyright format, copyright years, copyright holders (not subject to AI review)
+- Commit message formatting (subject length, punctuation, tag order, case-sensitive terms) -- checked by checkpatch
+- Meson file lines under 100 characters
+- Comparisons using `== 0`, `!= 0`, `== NULL`, `!= NULL` as "implicit" (these ARE explicit)
+- Comparisons wrapped in `likely()` or `unlikely()` macros - these are still explicit if using == or !=
+- Anything you determine is correct (do not mention non-issues or say "No issue here")
+- `REGISTER_FAST_TEST` using `NOHUGE_OK`/`ASAN_OK` macros (this is the correct current format)
+- Missing release notes for test-only changes (unit tests do not require release notes)
+- Missing release notes for internal APIs or helper functions (only public APIs need release notes)
+- Any item you later correct with "(Correction: ...)" or "actually acceptable" - just omit it
+- Vague concerns ("should be verified", "should be checked") - if you're not sure it's wrong, don't flag it
+- Items where you say "which is correct" or "this is correct" - if it's correct, don't mention it at all
+- Items where you conclude "no issue here" or "this is actually correct" - omit these entirely
+- Clean patches in a series - do not include a patch just to say "no issues" or describe what it does
+- Cross-patch compilation dependencies - you cannot determine patch ordering correctness from review
+- Claims that a symbol "was removed in patch N" causing issues in patch M - assume author ordered correctly
+- Any speculation about whether patches will compile when applied in sequence
+- Mutexes/locks in process-private memory (standard `malloc`, stack, static non-shared) - these don't need `PTHREAD_PROCESS_SHARED`
+- Use of `rte_spinlock_t` or `rte_rwlock_t` in shared memory (these work correctly without special init)
+- `volatile` used for MMIO/hardware register access in drivers (this is correct usage)
+- Left shift of `uint8_t`/`uint16_t` where the result is stored in a `uint32_t` or narrower variable and not used in pointer arithmetic or 64-bit context (sign extension cannot occur)
+- `getenv()` used in EAL, examples, app/test, or build/config scripts (only flag in drivers/ and lib/)
+- Reading `rxmode.mtu` inside `rte_eth_dev_configure()` implementation (that is where the user request is consumed)
+- `=` assignment to MTU or frame length fields during initial setup (only flag stale reads of `rxmode.mtu` outside configure)
+- PMDs that auto-enable scatter when MTU exceeds mbuf size (this is the correct pattern)
+- Hardcoded `RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN` as overhead when the PMD does not support VLAN and device info is consistent
+- Tagged frames exceeding 1518 bytes at standard MTU -- a single-tagged frame of 1522 bytes is valid at MTU 1500 (the outer VLAN header is L2 overhead, not payload). Note: inner VLAN tags in QinQ *do* consume MTU; see the MTU section for details.
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+# Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example:
+This could panic if the string is NULL.
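+
+A complete three-part comment, using a hypothetical snippet, might read:
+
+This could crash if `name` is NULL; callers reach this path from the CLI
+with no argument. Suggested fix:
+
+```c
+if (name == NULL)
+    return -EINVAL;
+```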
+
+---
+
+## FINAL CHECK BEFORE SUBMITTING REVIEW
+
+Before outputting your review, do two separate passes:
+
+### Pass 1: Verify correctness bugs are included
+
+Ask: "Did I trace every error path for resource leaks? Did I check
+for use-after-free? Did I verify error codes are propagated?"
+
+If you identified a potential correctness bug but talked yourself
+out of it, **add it back**. It is better to report a possible bug
+than to miss a real one.
+
+### Pass 2: Remove style/process false positives
+
+For EACH style/process item, ask: "Did I conclude this is actually
+fine/correct/acceptable/no issue?"
+
+If YES, DELETE THAT ITEM. It should not be in your output.
+
+An item that says "X is wrong... actually this is correct" is a
+FALSE POSITIVE and must be removed. This applies to style, format,
+and process items only.
+
+**If your Errors section would be empty after this check, that's
+fine -- it means the patches are good.**
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v12 2/6] devtools: add multi-provider AI patch review script
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
@ 2026-04-01 15:38     ` Stephen Hemminger
  2026-04-02  4:00       ` sunyuechi
  2026-04-01 15:38     ` [PATCH v12 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                       ` (3 subsequent siblings)
  5 siblings, 1 reply; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

This is an AI-generated script to review DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 1528 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1528 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..99748f0f04
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,1528 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from dataclasses import dataclass, field
+from datetime import date
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, Iterator
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Large file handling modes
+LARGE_FILE_MODES = ["error", "truncate", "chunk", "commits-only", "summary"]
+
+# Approximate characters per token (conservative estimate for code)
+CHARS_PER_TOKEN = 3.5
+
+# Default token limits by provider (leaving room for system prompt and response)
+PROVIDER_INPUT_LIMITS = {
+    "anthropic": 180000,  # 200K context, reserve for system/response
+    "openai": 900000,  # GPT-4.1 has 1M context
+    "xai": 1800000,  # Grok 4.1 Fast has 2M context
+    "google": 900000,  # Gemini 3 Flash has 1M context
+}
+
+
+@dataclass
+class TokenUsage:
+    """Accumulated token usage across API calls."""
+
+    input_tokens: int = 0
+    output_tokens: int = 0
+    cache_creation_tokens: int = 0
+    cache_read_tokens: int = 0
+    api_calls: int = 0
+
+    def add(self, other: "TokenUsage") -> None:
+        """Accumulate usage from another TokenUsage."""
+        self.input_tokens += other.input_tokens
+        self.output_tokens += other.output_tokens
+        self.cache_creation_tokens += other.cache_creation_tokens
+        self.cache_read_tokens += other.cache_read_tokens
+        self.api_calls += other.api_calls
+
+
+# Pricing per million tokens (USD) - update as prices change.
+# Keyed by provider, then model-name prefix; first prefix match wins.
+# "default" key is fallback for unknown models within a provider.
+PRICING: dict[str, dict[str, dict[str, float]]] = {
+    "anthropic": {
+        "claude-opus-4": {
+            "input": 15.0, "output": 75.0,
+            "cache_write": 18.75, "cache_read": 1.50,
+        },
+        "claude-sonnet-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+        "claude-haiku-4": {
+            "input": 0.80, "output": 4.0,
+            "cache_write": 1.0, "cache_read": 0.08,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+    },
+    "openai": {
+        # Longer prefixes first so gpt-4.1-mini/nano are not shadowed by gpt-4.1
+        "gpt-4.1-mini": {
+            "input": 0.40, "output": 1.60,
+            "cache_write": 0.40, "cache_read": 0.10,
+        },
+        "gpt-4.1-nano": {
+            "input": 0.10, "output": 0.40,
+            "cache_write": 0.10, "cache_read": 0.025,
+        },
+        "gpt-4.1": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+        "default": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+    },
+    "xai": {
+        "grok-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+    },
+    "google": {
+        "gemini-3-flash": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+        "default": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+    },
+}
+
+
+def get_pricing(provider: str, model: str) -> dict[str, float]:
+    """Look up per-million-token pricing for a provider/model."""
+    provider_prices = PRICING.get(provider, {})
+    for prefix, prices in provider_prices.items():
+        if prefix != "default" and model.startswith(prefix):
+            return prices
+    return provider_prices.get(
+        "default", {"input": 0, "output": 0, "cache_write": 0, "cache_read": 0}
+    )
+
+
+def estimate_cost(usage: TokenUsage, provider: str, model: str) -> float:
+    """Estimate cost in USD from token usage."""
+    prices = get_pricing(provider, model)
+    cost = 0.0
+    # Non-cached input tokens = total input - cache_read
+    # (cache_creation tokens are billed at cache_write rate)
+    regular_input = usage.input_tokens - usage.cache_read_tokens
+    cost += regular_input * prices.get("input", 0) / 1_000_000
+    cost += usage.output_tokens * prices.get("output", 0) / 1_000_000
+    cost += usage.cache_creation_tokens * prices.get("cache_write", 0) / 1_000_000
+    cost += usage.cache_read_tokens * prices.get("cache_read", 0) / 1_000_000
+    return cost
+
+
+def format_token_summary(
+    usage: TokenUsage, provider: str, model: str, show_costs: bool
+) -> str:
+    """Format a token usage summary string."""
+    lines = ["=== Token Usage Summary ==="]
+    lines.append(f"API calls:     {usage.api_calls}")
+    lines.append(f"Input tokens:  {usage.input_tokens:,}")
+    lines.append(f"Output tokens: {usage.output_tokens:,}")
+    if usage.cache_creation_tokens:
+        lines.append(f"Cache write:   {usage.cache_creation_tokens:,}")
+    if usage.cache_read_tokens:
+        lines.append(f"Cache read:    {usage.cache_read_tokens:,}")
+    total = usage.input_tokens + usage.output_tokens
+    lines.append(f"Total tokens:  {total:,}")
+    if show_costs:
+        cost = estimate_cost(usage, provider, model)
+        lines.append(f"Est. cost:     ${cost:.4f}")
+    lines.append("=" * 27)
+    return "\n".join(lines)
+
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# LTS releases: any DPDK release with minor version .11
+# (e.g., 19.11, 20.11, 21.11, 22.11, 23.11, 24.11, 25.11, ...)
+
+SYSTEM_PROMPT_BASE = """\
+You are an expert DPDK code reviewer. Analyze patches for compliance with \
+DPDK coding standards and contribution guidelines. Provide clear, actionable \
+feedback organized by severity (Error, Warning, Info) as defined in the \
+guidelines."""
+
+LTS_RULES = """
+LTS (Long Term Stable) branch rules apply:
+- Only bug fixes allowed, no new features
+- No new APIs (experimental or stable)
+- ABI must remain unchanged
+- Backported fixes should reference the original commit with Fixes: tag
+- Copyright years should reflect when the code was originally written
+- Be conservative: reject changes that aren't clearly bug fixes"""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Focus on:
+
+1. Correctness bugs (resource leaks, use-after-free, race conditions, etc.)
+2. C coding style (forbidden tokens, implicit comparisons, unnecessary patterns)
+3. API and documentation requirements
+4. Any other guideline violations
+
+Note: commit message formatting and SPDX/copyright compliance are checked \
+by checkpatches.sh and should NOT be flagged here.
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def is_lts_release(release: str | None) -> bool:
+    """Check if a release is an LTS release.
+
+    Per DPDK project guidelines, any release with minor version .11
+    is an LTS release (e.g., 19.11, 21.11, 23.11, 24.11, 25.11).
+    """
+    if not release:
+        return False
+    # Check for explicit -lts suffix
+    if "-lts" in release.lower():
+        return True
+    # Extract base version (e.g., "23.11" from "23.11.1" or "23.11-rc1")
+    version = release.split("-")[0]
+    parts = version.split(".")
+    if len(parts) >= 2:
+        try:
+            minor = int(parts[1])
+            return minor == 11
+        except ValueError:
+            pass
+    return False
+
+
+def estimate_tokens(text: str) -> int:
+    """Estimate token count from text length."""
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def split_mbox_patches(content: str) -> list[str]:
+    """Split an mbox file into individual patches."""
+    patches = []
+    current_patch = []
+    in_patch = False
+
+    for line in content.split("\n"):
+        # Detect start of new message in mbox format
+        if line.startswith("From ") and (
+            " Mon " in line
+            or " Tue " in line
+            or " Wed " in line
+            or " Thu " in line
+            or " Fri " in line
+            or " Sat " in line
+            or " Sun " in line
+        ):
+            if current_patch:
+                patches.append("\n".join(current_patch))
+            current_patch = [line]
+            in_patch = True
+        elif in_patch:
+            current_patch.append(line)
+
+    # Don't forget the last patch
+    if current_patch:
+        patches.append("\n".join(current_patch))
+
+    return patches if patches else [content]
+
+
+def extract_commit_messages(content: str) -> str:
+    """Extract only commit messages from patch content."""
+    patches = split_mbox_patches(content)
+    messages = []
+
+    for patch in patches:
+        lines = patch.split("\n")
+        msg_lines = []
+        in_headers = True
+        in_body = False
+        found_subject = False
+
+        for line in lines:
+            # Collect headers we care about
+            if in_headers:
+                if line.startswith("Subject:"):
+                    msg_lines.append(line)
+                    found_subject = True
+                elif line.startswith(("From:", "Date:")):
+                    msg_lines.append(line)
+                elif line.startswith((" ", "\t")) and found_subject:
+                    # Subject continuation
+                    msg_lines.append(line)
+                elif line == "":
+                    if found_subject:
+                        in_headers = False
+                        in_body = True
+                        msg_lines.append("")
+            elif in_body:
+                # Stop at the diff
+                if line.startswith("---") and not line.startswith("----"):
+                    break
+                if line.startswith("diff --git"):
+                    break
+                msg_lines.append(line)
+
+        if msg_lines:
+            messages.append("\n".join(msg_lines))
+
+    return "\n\n---\n\n".join(messages)
+
+
+def truncate_content(
+    content: str, max_tokens: float, provider: str
+) -> tuple[str, bool]:
+    """Truncate content to fit within token limit."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN)
+
+    if len(content) <= max_chars:
+        return content, False
+
+    # Try to truncate at a reasonable boundary
+    truncated = content[:max_chars]
+
+    # Find last complete diff hunk or patch boundary
+    last_diff = truncated.rfind("\ndiff --git")
+    last_patch = truncated.rfind("\nFrom ")
+
+    if last_diff > max_chars * 0.5:
+        truncated = truncated[:last_diff]
+    elif last_patch > max_chars * 0.5:
+        truncated = truncated[:last_patch]
+
+    truncated += "\n\n[... Content truncated due to size limits ...]\n"
+    return truncated, True
+
+
+def chunk_content(
+    content: str, max_tokens: int, provider: str
+) -> Iterator[tuple[str, int, int]]:
+    """Split content into chunks that fit within token limit.
+
+    Yields tuples of (chunk_content, chunk_number, total_chunks).
+    """
+    patches = split_mbox_patches(content)
+
+    if len(patches) == 1:
+        # Single large patch - split by diff sections
+        yield from chunk_single_patch(content, max_tokens)
+        return
+
+    # Multiple patches - group them to fit within limits
+    chunks = []
+    current_chunk = []
+    current_size = 0
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)  # 90% to leave margin
+
+    for patch in patches:
+        patch_size = len(patch)
+        if current_size + patch_size > max_chars and current_chunk:
+            chunks.append("\n".join(current_chunk))
+            current_chunk = []
+            current_size = 0
+
+        if patch_size > max_chars:
+            # Single patch too large, truncate it
+            if current_chunk:
+                chunks.append("\n".join(current_chunk))
+                current_chunk = []
+                current_size = 0
+            truncated, _ = truncate_content(patch, max_tokens * 0.9, provider)
+            chunks.append(truncated)
+        else:
+            current_chunk.append(patch)
+            current_size += patch_size
+
+    if current_chunk:
+        chunks.append("\n".join(current_chunk))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def chunk_single_patch(
+    content: str, max_tokens: int
+) -> Iterator[tuple[str, int, int]]:
+    """Split a single large patch by diff sections."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)
+
+    # Extract header (everything before first diff)
+    first_diff = content.find("\ndiff --git")
+    if first_diff == -1:
+        # No diff sections, just truncate
+        truncated, _ = truncate_content(content, max_tokens * 0.9, "anthropic")
+        yield truncated, 1, 1
+        return
+
+    header = content[: first_diff + 1]
+    diff_content = content[first_diff + 1 :]
+
+    # Split by diff sections
+    diffs = []
+    current_diff = []
+    for line in diff_content.split("\n"):
+        if line.startswith("diff --git") and current_diff:
+            diffs.append("\n".join(current_diff))
+            current_diff = []
+        current_diff.append(line)
+    if current_diff:
+        diffs.append("\n".join(current_diff))
+
+    # Group diffs into chunks
+    chunks = []
+    current_chunk_diffs = []
+    current_size = len(header)
+
+    for diff in diffs:
+        diff_size = len(diff)
+        if current_size + diff_size > max_chars and current_chunk_diffs:
+            chunks.append(header + "\n".join(current_chunk_diffs))
+            current_chunk_diffs = []
+            current_size = len(header)
+
+        if diff_size + len(header) > max_chars:
+            # Single diff too large
+            if current_chunk_diffs:
+                chunks.append(header + "\n".join(current_chunk_diffs))
+                current_chunk_diffs = []
+            truncated_diff = diff[: max_chars - len(header) - 100]
+            truncated_diff += "\n[... diff truncated ...]\n"
+            chunks.append(header + truncated_diff)
+            current_size = len(header)
+        else:
+            current_chunk_diffs.append(diff)
+            current_size += diff_size
+
+    if current_chunk_diffs:
+        chunks.append(header + "\n".join(current_chunk_diffs))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def get_summary_prompt() -> str:
+    """Get prompt modifications for summary mode."""
+    return """
+NOTE: This is a LARGE patch series. Provide a HIGH-LEVEL summary review only:
+- Focus on overall architecture and design concerns
+- Check commit message formatting across the series
+- Identify any obvious policy violations
+- Do NOT attempt detailed line-by-line code review
+- Summarize the scope and purpose of the changes
+"""
+
+
+def format_combined_reviews(
+    reviews: list[tuple[str, str]], output_format: str, patch_name: str
+) -> str:
+    """Combine multiple chunk/patch reviews into a single output."""
+    if output_format == "json":
+        combined = {
+            "patch_file": patch_name,
+            "sections": [
+                {"label": label, "review": review} for label, review in reviews
+            ],
+        }
+        return json.dumps(combined, indent=2)
+    elif output_format == "html":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"<h2>{label}</h2>\n{review}")
+        return "\n<hr>\n".join(sections)
+    elif output_format == "markdown":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"## {label}\n\n{review}")
+        return "\n\n---\n\n".join(sections)
+    else:  # text
+        sections = []
+        for label, review in reviews:
+            sections.append(f"=== {label} ===\n\n{review}")
+        return ("\n\n" + "=" * 60 + "\n\n").join(sections)
+
+
+def build_system_prompt(review_date: str, release: str | None) -> str:
+    """Build system prompt with date and release context."""
+    prompt = SYSTEM_PROMPT_BASE
+    prompt += f"\n\nCurrent date: {review_date}."
+
+    if release:
+        prompt += f"\nTarget DPDK release: {release}."
+        if is_lts_release(release):
+            prompt += LTS_RULES
+        else:
+            prompt += "\nThis is a main branch or standard release."
+            prompt += "\nNew features and experimental APIs are allowed."
+
+    return prompt
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": system_prompt},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": system_prompt},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": system_prompt}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+    verbose: bool = False,
+) -> tuple[str, TokenUsage]:
+    """Make API request to the specified provider.
+
+    Returns a tuple of (response_text, token_usage).
+    """
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Extract token usage
+    usage = TokenUsage(api_calls=1)
+    if provider == "anthropic":
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("input_tokens", 0)
+        usage.output_tokens = raw_usage.get("output_tokens", 0)
+        usage.cache_creation_tokens = raw_usage.get(
+            "cache_creation_input_tokens", 0
+        )
+        usage.cache_read_tokens = raw_usage.get("cache_read_input_tokens", 0)
+    elif provider == "google":
+        raw_usage = result.get("usageMetadata", {})
+        usage.input_tokens = raw_usage.get("promptTokenCount", 0)
+        usage.output_tokens = raw_usage.get("candidatesTokenCount", 0)
+    else:  # openai, xai
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("prompt_tokens", 0)
+        usage.output_tokens = raw_usage.get("completion_tokens", 0)
+        # OpenAI cache details (if available)
+        cache_details = raw_usage.get("prompt_tokens_details", {})
+        if cache_details:
+            usage.cache_read_tokens = cache_details.get("cached_tokens", 0)
+
+    # Show per-call details in verbose mode
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        print(f"Input tokens: {usage.input_tokens:,}", file=sys.stderr)
+        print(f"Output tokens: {usage.output_tokens:,}", file=sys.stderr)
+        if usage.cache_creation_tokens:
+            print(
+                f"Cache creation: {usage.cache_creation_tokens:,}",
+                file=sys.stderr,
+            )
+        if usage.cache_read_tokens:
+            print(
+                f"Cache read: {usage.cache_read_tokens:,}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        text = "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+        return text, usage
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        text = "".join(part.get("text", "") for part in parts)
+        return text, usage
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        text = choices[0].get("message", {}).get("content", "")
+        return text, usage
+
+
+def get_last_message_id(patch_content: str) -> str | None:
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-ID:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize: remove < > and add them back
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content: str) -> str | None:
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][8:].strip()
+            i += 1
+            # Unfold continuation lines, joining with a single space
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+) -> bool:
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try sendmail
+        try:
+            subprocess.run(
+                ["sendmail", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via sendmail", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        # Try msmtp
+        try:
+            subprocess.run(
+                ["msmtp", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via msmtp", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s -r 24.11 patch.patch           # Review for specific release
+    %(prog)s -r 24.11-lts patch.patch       # Review for LTS branch
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+
+Large File Handling:
+    %(prog)s --split-patches series.mbox    # Review each patch separately
+    %(prog)s --split-patches --patch-range 1-5 series.mbox  # Review patches 1-5
+    %(prog)s --large-file=truncate patch.mbox   # Truncate to fit limit
+    %(prog)s --large-file=commits-only series.mbox  # Review commit messages only
+    %(prog)s --large-file=summary series.mbox   # High-level summary only
+    %(prog)s --large-file=chunk series.mbox     # Split and review in chunks
+
+Large File Modes:
+    error       - Fail with error (default)
+    truncate    - Truncate content to fit token limit
+    chunk       - Split into chunks and review each
+    commits-only - Extract and review only commit messages
+    summary     - Request high-level summary review
+
+LTS Releases:
+    Use -r/--release with LTS version (e.g., 24.11-lts, 23.11) to enable
+    stricter review rules: bug fixes only, no new features or APIs.
+    Any DPDK release with minor version .11 is an LTS release.
+
+Token Usage:
+    Token counts are always printed to stderr after each run.
+    %(prog)s -c patch.patch                  # Include estimated cost
+    %(prog)s -c -f json -o r.json patch.patch  # Cost in JSON metadata too
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+    parser.add_argument(
+        "-c",
+        "--show-costs",
+        action="store_true",
+        help="Show estimated cost alongside token usage summary",
+    )
+
+    # Date and release options
+    parser.add_argument(
+        "-D",
+        "--date",
+        metavar="YYYY-MM-DD",
+        help="Review date context (default: today)",
+    )
+    parser.add_argument(
+        "-r",
+        "--release",
+        metavar="VERSION",
+        help="Target DPDK release (e.g., 24.11, 23.11-lts)",
+    )
+
+    # Large file handling options
+    large_group = parser.add_argument_group("Large File Handling")
+    large_group.add_argument(
+        "--large-file",
+        choices=LARGE_FILE_MODES,
+        default="error",
+        metavar="MODE",
+        help="How to handle large files: error (default), truncate, "
+        "chunk, commits-only, summary",
+    )
+    large_group.add_argument(
+        "--max-tokens",
+        type=int,
+        metavar="N",
+        help="Max input tokens (default: provider-specific)",
+    )
+    large_group.add_argument(
+        "--split-patches",
+        action="store_true",
+        help="Split mbox into individual patches and review each separately",
+    )
+    large_group.add_argument(
+        "--patch-range",
+        metavar="N-M",
+        help="Review only patches N through M (1-indexed, use with --split-patches)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine review date
+    review_date = args.date or date.today().isoformat()
+
+    # Build system prompt with date and release context
+    system_prompt = build_system_prompt(review_date, args.release)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    # Determine max tokens for this provider
+    max_input_tokens = args.max_tokens or PROVIDER_INPUT_LIMITS.get(
+        args.provider, 100000
+    )
+
+    # Estimate token count
+    estimated_tokens = estimate_tokens(patch_content + agents_content)
+
+    # Accumulate token usage across all API calls
+    total_usage = TokenUsage()
+
+    # Parse patch range if specified
+    patch_start, patch_end = None, None
+    if args.patch_range:
+        try:
+            if "-" in args.patch_range:
+                start, end = args.patch_range.split("-", 1)
+                patch_start = int(start)
+                patch_end = int(end)
+            else:
+                patch_start = patch_end = int(args.patch_range)
+        except ValueError:
+            error(f"Invalid --patch-range format: {args.patch_range}")
+
+    # Handle --split-patches mode
+    if args.split_patches:
+        patches = split_mbox_patches(patch_content)
+        total_patches = len(patches)
+
+        if total_patches == 1:
+            print(
+                "Note: Only 1 patch found in mbox, --split-patches has no effect",
+                file=sys.stderr,
+            )
+        else:
+            print(
+                f"Found {total_patches} patches in mbox",
+                file=sys.stderr,
+            )
+
+            # Apply patch range filter
+            if patch_start is not None:
+                if patch_start < 1 or patch_start > total_patches:
+                    error(
+                        f"Patch range start {patch_start} out of range (1-{total_patches})"
+                    )
+                if patch_end < patch_start or patch_end > total_patches:
+                    error(
+                        f"Patch range end {patch_end} out of range ({patch_start}-{total_patches})"
+                    )
+                patches = patches[patch_start - 1 : patch_end]
+                print(
+                    f"Reviewing patches {patch_start}-{patch_end} ({len(patches)} patches)",
+                    file=sys.stderr,
+                )
+
+            # Review each patch separately
+            all_reviews = []
+            for i, patch in enumerate(patches, patch_start or 1):
+                patch_label = f"Patch {i}/{total_patches}"
+                print(f"\nReviewing {patch_label}...", file=sys.stderr)
+
+                review_text, call_usage = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    patch,
+                    f"{patch_name} ({patch_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                total_usage.add(call_usage)
+                all_reviews.append((patch_label, review_text))
+
+            # Combine reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal API call
+            estimated_tokens = 0  # Bypass size check since we've already processed
+
+    # Check if content is too large
+    is_large = estimated_tokens > max_input_tokens
+
+    if is_large:
+        print(
+            f"Warning: Estimated {estimated_tokens:,} tokens exceeds limit of "
+            f"{max_input_tokens:,}",
+            file=sys.stderr,
+        )
+
+        if args.large_file == "error":
+            error(
+                f"Patch file too large ({estimated_tokens:,} tokens). "
+                f"Use --large-file=truncate|chunk|commits-only|summary to handle, "
+                f"or --split-patches to review patches individually."
+            )
+        elif args.large_file == "truncate":
+            print("Truncating content to fit token limit...", file=sys.stderr)
+            patch_content, was_truncated = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+            if was_truncated:
+                print("Content was truncated.", file=sys.stderr)
+        elif args.large_file == "commits-only":
+            print("Extracting commit messages only...", file=sys.stderr)
+            patch_content = extract_commit_messages(patch_content)
+            new_estimate = estimate_tokens(patch_content + agents_content)
+            print(
+                f"Reduced to ~{new_estimate:,} tokens (commit messages only)",
+                file=sys.stderr,
+            )
+            if new_estimate > max_input_tokens:
+                patch_content, _ = truncate_content(
+                    patch_content, max_input_tokens, args.provider
+                )
+        elif args.large_file == "summary":
+            print("Using summary mode for large patch...", file=sys.stderr)
+            system_prompt += get_summary_prompt()
+            patch_content, _ = truncate_content(
+                patch_content, max_input_tokens, args.provider
+            )
+        elif args.large_file == "chunk":
+            print("Processing in chunks...", file=sys.stderr)
+            all_reviews = []
+            for chunk, chunk_num, total_chunks in chunk_content(
+                patch_content, max_input_tokens, args.provider
+            ):
+                chunk_label = f"Chunk {chunk_num}/{total_chunks}"
+                print(f"Reviewing {chunk_label}...", file=sys.stderr)
+
+                review_text, call_usage = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    chunk,
+                    f"{patch_name} ({chunk_label})",
+                    args.output_format,
+                    args.verbose,
+                )
+                total_usage.add(call_usage)
+                all_reviews.append((chunk_label, review_text))
+
+            # Combine chunk reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal single API call below
+            estimated_tokens = 0
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Review date: {review_date}", file=sys.stderr)
+        if args.release:
+            lts_status = " (LTS)" if is_lts_release(args.release) else ""
+            print(f"Target release: {args.release}{lts_status}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        print(f"Estimated tokens: {estimated_tokens:,}", file=sys.stderr)
+        print(f"Max input tokens: {max_input_tokens:,}", file=sys.stderr)
+        if args.large_file != "error":
+            print(f"Large file mode: {args.large_file}", file=sys.stderr)
+        if args.split_patches:
+            print("Split patches: yes", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API (unless already processed via chunks/split)
+    if estimated_tokens > 0:  # Not already processed
+        review_text, call_usage = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            args.output_format,
+            args.verbose,
+        )
+        total_usage.add(call_usage)
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        usage_data = {
+            "api_calls": total_usage.api_calls,
+            "input_tokens": total_usage.input_tokens,
+            "output_tokens": total_usage.output_tokens,
+            "total_tokens": total_usage.input_tokens + total_usage.output_tokens,
+        }
+        if total_usage.cache_creation_tokens:
+            usage_data["cache_creation_tokens"] = total_usage.cache_creation_tokens
+        if total_usage.cache_read_tokens:
+            usage_data["cache_read_tokens"] = total_usage.cache_read_tokens
+        if args.show_costs:
+            usage_data["estimated_cost_usd"] = round(
+                estimate_cost(total_usage, args.provider, model), 6
+            )
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "review_date": review_date,
+                "target_release": args.release,
+                "is_lts": is_lts_release(args.release) if args.release else False,
+                "token_usage": usage_data,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"<br>Target release: {args.release}{lts_badge}"
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) on {review_date} -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model}) on {review_date}{release_info}</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"\n*Target release: {args.release}{lts_badge}*\n"
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model}) on {review_date}*
+{release_info}
+{review_text}
+"""
+    else:  # text
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n"
+        output_text += f"Review date: {review_date}\n"
+        output_text += release_info
+        output_text += "\n" + review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Print token usage summary
+    if total_usage.api_calls > 0:
+        print("", file=sys.stderr)
+        print(
+            format_token_summary(
+                total_usage, args.provider, model, args.show_costs
+            ),
+            file=sys.stderr,
+        )
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model}) on {review_date}
+{release_info}
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0



* [PATCH v12 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-04-01 15:38     ` Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.<ext> when using the
output directory option, with <ext> matching the selected format
(txt, md, html, or json).
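
The naming rule can be sketched in Python (illustrative only; the
values below are hypothetical examples mirroring the PATCH_STEM and
get_extension logic in the script):

```python
# Sketch of compare-reviews.sh's output-file naming, for illustration.
from pathlib import Path

patch_file = "my-patch.patch"   # hypothetical input patch
provider = "anthropic"          # one entry from the provider list
ext = "txt"                     # from the -f format (text -> txt)

# Equivalent of "${PATCH_BASENAME%.*}" in the shell script
stem = Path(patch_file).stem
output_file = f"./reviews/{stem}-{provider}.{ext}"
print(output_file)
```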

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 192 ++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..a63eeffb71
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,192 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -e -o pipefail
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+VERBOSE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            FORMAT="$2"
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: \
+ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"
+        echo ""
+        echo "Saved to: $OUTPUT_FILE"
+    else
+        python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            $VERBOSE \
+            "$PATCH_FILE"
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
-- 
2.53.0



* [PATCH v12 4/6] devtools: add multi-provider AI documentation review script
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (2 preceding siblings ...)
  2026-04-01 15:38     ` [PATCH v12 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-04-01 15:38     ` Stephen Hemminger
  2026-04-02  4:05       ` sunyuechi
  2026-04-01 15:38     ` [PATCH v12 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 1 reply; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models. Supports batch processing of multiple files.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

Output formats (-f/--format):
  - text: plain text with extractable diff/msg markers (default)
  - markdown: formatted review document
  - html: complete HTML document with styling
  - json: structured data with metadata

For each input file, the script produces:
  - <basename>.{txt,md,html,json}: review in selected format
  - <basename>.diff: unified diff (text/json formats, or any format with -d)
  - <basename>.msg: commit message (text/json formats, or any format with -d)

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide).
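
The lookup is a first-match-wins scan over path prefixes; a minimal
sketch using a reduced subset of the script's COMMIT_PREFIX_MAP:

```python
# Reduced version of review-doc.py's prefix table, for illustration only.
COMMIT_PREFIX_MAP = [
    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
    ("doc/guides/nics/", "doc/guides/nics:"),
    ("doc/guides/", "doc:"),  # generic fallback for other guides
]

def get_commit_prefix(filepath: str) -> str:
    # First matching path prefix wins; plain "doc:" if nothing matches.
    for prefix_path, prefix in COMMIT_PREFIX_MAP:
        if filepath.startswith(prefix_path):
            return prefix
    return "doc:"

print(get_commit_prefix("doc/guides/prog_guide/mempool_lib.rst"))
```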

Features:
  - Multiple file processing with glob support
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output-dir option
  - Output format selection via -f/--format option
  - Force diff/msg generation via -d/--diff option
  - Quiet mode (-q) suppresses stdout output
  - Verbose mode (-v) shows token usage and API details
  - Email integration using git sendemail configuration
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py doc/guides/nics/*.rst
  ./devtools/review-doc.py -f html -d -o /tmp doc/guides/nics/*.rst
  ./devtools/review-doc.py --send-email --to dev@dpdk.org file.rst

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 1277 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1277 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..a07f8f3a59
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,1277 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Accepts multiple documentation files and generates output for each.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+from dataclasses import dataclass
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Map output format to file extension
+FORMAT_EXTENSIONS = {
+    "text": ".txt",
+    "markdown": ".md",
+    "html": ".html",
+    "json": ".json",
+}
+
+# Additional markers for extracting diff/msg (used with --diff flag)
+DIFF_MARKERS_INSTRUCTION = """
+
+ADDITIONALLY, at the end of your response, include these exact markers for automated extraction:
+---COMMIT_MESSAGE_START---
+(same commit message as above)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(same unified diff as above)
+---UNIFIED_DIFF_END---
+"""
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+
+@dataclass
+class TokenUsage:
+    """Accumulated token usage across API calls."""
+
+    input_tokens: int = 0
+    output_tokens: int = 0
+    cache_creation_tokens: int = 0
+    cache_read_tokens: int = 0
+    api_calls: int = 0
+
+    def add(self, other: "TokenUsage") -> None:
+        """Accumulate usage from another TokenUsage."""
+        self.input_tokens += other.input_tokens
+        self.output_tokens += other.output_tokens
+        self.cache_creation_tokens += other.cache_creation_tokens
+        self.cache_read_tokens += other.cache_read_tokens
+        self.api_calls += other.api_calls
+
+
+# Pricing per million tokens (USD) - update as prices change.
+# Outer keys are providers; inner keys are model-name prefixes; first match wins.
+# "default" key is fallback for unknown models within a provider.
+PRICING: dict[str, dict[str, dict[str, float]]] = {
+    "anthropic": {
+        "claude-opus-4": {
+            "input": 15.0, "output": 75.0,
+            "cache_write": 18.75, "cache_read": 1.50,
+        },
+        "claude-sonnet-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+        "claude-haiku-4": {
+            "input": 0.80, "output": 4.0,
+            "cache_write": 1.0, "cache_read": 0.08,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+    },
+    "openai": {
+        "gpt-4.1": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+        "gpt-4.1-mini": {
+            "input": 0.40, "output": 1.60,
+            "cache_write": 0.40, "cache_read": 0.10,
+        },
+        "gpt-4.1-nano": {
+            "input": 0.10, "output": 0.40,
+            "cache_write": 0.10, "cache_read": 0.025,
+        },
+        "default": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+    },
+    "xai": {
+        "grok-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+    },
+    "google": {
+        "gemini-3-flash": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+        "default": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+    },
+}
+
+
+def get_pricing(provider: str, model: str) -> dict[str, float]:
+    """Look up per-million-token pricing for a provider/model."""
+    provider_prices = PRICING.get(provider, {})
+    for prefix, prices in provider_prices.items():
+        if prefix != "default" and model.startswith(prefix):
+            return prices
+    return provider_prices.get(
+        "default", {"input": 0, "output": 0, "cache_write": 0, "cache_read": 0}
+    )
+
+
+def estimate_cost(usage: TokenUsage, provider: str, model: str) -> float:
+    """Estimate cost in USD from token usage."""
+    prices = get_pricing(provider, model)
+    cost = 0.0
+    # Non-cached input tokens = total input - cache_read
+    regular_input = usage.input_tokens - usage.cache_read_tokens
+    cost += regular_input * prices.get("input", 0) / 1_000_000
+    cost += usage.output_tokens * prices.get("output", 0) / 1_000_000
+    cost += usage.cache_creation_tokens * prices.get("cache_write", 0) / 1_000_000
+    cost += usage.cache_read_tokens * prices.get("cache_read", 0) / 1_000_000
+    return cost
+
+
+def format_token_summary(
+    usage: TokenUsage, provider: str, model: str, show_costs: bool
+) -> str:
+    """Format a token usage summary string."""
+    lines = ["=== Token Usage Summary ==="]
+    lines.append(f"API calls:     {usage.api_calls}")
+    lines.append(f"Input tokens:  {usage.input_tokens:,}")
+    lines.append(f"Output tokens: {usage.output_tokens:,}")
+    if usage.cache_creation_tokens:
+        lines.append(f"Cache write:   {usage.cache_creation_tokens:,}")
+    if usage.cache_read_tokens:
+        lines.append(f"Cache read:    {usage.cache_read_tokens:,}")
+    total = usage.input_tokens + usage.output_tokens
+    lines.append(f"Total tokens:  {total:,}")
+    if show_costs:
+        cost = estimate_cost(usage, provider, model)
+        lines.append(f"Est. cost:     ${cost:.4f}")
+    lines.append("=" * 27)
+    return "\n".join(lines)
+
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds it via git commit --signoff)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg: str) -> None:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config() -> dict[str, Any]:
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    if config["port"]:
+        config["port"] = int(config["port"])
+
+    return config
+
+
+def get_commit_prefix(filepath: str) -> str:
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    user_prompt = USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+    return {
+        "contents": [
+            {"role": "user", "parts": [{"text": SYSTEM_PROMPT}]},
+            {"role": "user", "parts": [{"text": agents_content}]},
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+    verbose: bool = False,
+) -> tuple[str, TokenUsage]:
+    """Make API request to the specified provider.
+
+    Returns a tuple of (response_text, token_usage).
+    """
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {"Content-Type": "application/json"}
+        url = f"{config['endpoint']}/{model}:generateContent?key={api_key}"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request
+    request_body = json.dumps(request_data).encode("utf-8")
+    req = Request(url, data=request_body, headers=headers, method="POST")
+
+    try:
+        with urlopen(req) as response:
+            result = json.loads(response.read().decode("utf-8"))
+    except HTTPError as e:
+        error_body = e.read().decode("utf-8")
+        try:
+            error_data = json.loads(error_body)
+            error(f"API error: {error_data.get('error', error_body)}")
+        except json.JSONDecodeError:
+            error(f"API error ({e.code}): {error_body}")
+    except URLError as e:
+        error(f"Connection error: {e.reason}")
+
+    # Extract token usage
+    usage = TokenUsage(api_calls=1)
+    if provider == "anthropic":
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("input_tokens", 0)
+        usage.output_tokens = raw_usage.get("output_tokens", 0)
+        usage.cache_creation_tokens = raw_usage.get(
+            "cache_creation_input_tokens", 0
+        )
+        usage.cache_read_tokens = raw_usage.get("cache_read_input_tokens", 0)
+    elif provider == "google":
+        raw_usage = result.get("usageMetadata", {})
+        usage.input_tokens = raw_usage.get("promptTokenCount", 0)
+        usage.output_tokens = raw_usage.get("candidatesTokenCount", 0)
+    else:  # openai, xai
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("prompt_tokens", 0)
+        usage.output_tokens = raw_usage.get("completion_tokens", 0)
+        # OpenAI cache details (if available)
+        cache_details = raw_usage.get("prompt_tokens_details", {})
+        if cache_details:
+            usage.cache_read_tokens = cache_details.get("cached_tokens", 0)
+
+    # Show per-call details in verbose mode
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        print(f"Input tokens: {usage.input_tokens:,}", file=sys.stderr)
+        print(f"Output tokens: {usage.output_tokens:,}", file=sys.stderr)
+        if usage.cache_creation_tokens:
+            print(
+                f"Cache creation: {usage.cache_creation_tokens:,}",
+                file=sys.stderr,
+            )
+        if usage.cache_read_tokens:
+            print(
+                f"Cache read: {usage.cache_read_tokens:,}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        text = "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+        return text, usage
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        text = "".join(part.get("text", "") for part in parts)
+        return text, usage
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        text = choices[0].get("message", {}).get("content", "")
+        return text, usage
+
+
+def parse_review_text(review_text: str) -> tuple[str, str]:
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def strip_diff_markers(text: str) -> str:
+    """Remove the diff/msg extraction markers from text."""
+    # Remove commit message markers and content
+    text = re.sub(
+        r"\n*---COMMIT_MESSAGE_START---\s*\n.*?\n---COMMIT_MESSAGE_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    # Remove unified diff markers and content
+    text = re.sub(
+        r"\n*---UNIFIED_DIFF_START---\s*\n.*?\n---UNIFIED_DIFF_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    return text.strip()
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+    verbose: bool = False,
+) -> bool:
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+        return True
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers. "
+        "Accepts multiple files and generates output for each.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s doc/guides/nics/*.rst              # Review all NIC docs
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst doc/guides/nics/i40e.rst
+    %(prog)s -f html -d -o /tmp/reviews doc/guides/nics/*.rst  # HTML + diff files
+    %(prog)s -f json -o /tmp doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+Output files (in output-dir):
+    <basename>.txt|.md|.html|.json  Review in selected format
+    <basename>.diff                  Unified diff (text/json, or with --diff)
+    <basename>.msg                   Commit message (text/json, or with --diff)
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+
+Token Usage:
+    Token counts are always printed to stderr after each run.
+    %(prog)s -c doc/guides/nics/ixgbe.rst    # Include estimated cost
+    %(prog)s -c -f json doc/guides/nics/*.rst # Cost in JSON metadata too
+        """,
+    )
+
+    parser.add_argument(
+        "doc_files",
+        nargs="+",
+        metavar="doc_file",
+        help="Documentation file(s) to review",
+    )
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for all output files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress review output to stdout (only write files)",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-d",
+        "--diff",
+        action="store_true",
+        help="Always produce .diff and .msg files (automatic for text/json)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+    parser.add_argument(
+        "-c",
+        "--show-costs",
+        action="store_true",
+        help="Show estimated cost alongside token usage summary",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    # Validate all doc files exist before processing
+    doc_paths = []
+    for doc_file in args.doc_files:
+        doc_path = Path(doc_file)
+        if not doc_path.exists():
+            error(f"Documentation file not found: {doc_file}")
+        doc_paths.append((doc_file, doc_path))
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read AGENTS.md once
+    agents_content = agents_path.read_text()
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    provider_name = config["name"]
+
+    # Accumulate token usage across all API calls
+    total_usage = TokenUsage()
+
+    # Process each file
+    num_files = len(doc_paths)
+    for file_idx, (doc_file, doc_path) in enumerate(doc_paths, 1):
+        if num_files > 1:
+            print(
+                f"\n{'=' * 60}",
+                file=sys.stderr,
+            )
+            print(
+                f"Processing file {file_idx}/{num_files}: {doc_file}",
+                file=sys.stderr,
+            )
+            print(
+                f"{'=' * 60}",
+                file=sys.stderr,
+            )
+
+        # Determine output filenames
+        doc_basename = doc_path.stem
+        diff_file = output_dir / f"{doc_basename}.diff"
+        msg_file = output_dir / f"{doc_basename}.msg"
+
+        # Get commit prefix
+        commit_prefix = get_commit_prefix(doc_file)
+
+        # Read doc content
+        doc_content = doc_path.read_text()
+
+        if args.verbose:
+            print("=== Request ===", file=sys.stderr)
+            print(f"Provider: {args.provider}", file=sys.stderr)
+            print(f"Model: {model}", file=sys.stderr)
+            print(f"Output format: {args.output_format}", file=sys.stderr)
+            print(f"AGENTS file: {args.agents}", file=sys.stderr)
+            print(f"Doc file: {doc_file}", file=sys.stderr)
+            print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+            print(f"Output dir: {args.output_dir}", file=sys.stderr)
+            if args.send_email:
+                print("Send email: yes", file=sys.stderr)
+                print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+                if args.cc_addrs:
+                    print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+                print(f"From: {from_addr}", file=sys.stderr)
+            print("===============", file=sys.stderr)
+
+        # Call API
+        review_text, call_usage = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            args.output_format,
+            args.diff,
+            args.verbose,
+        )
+        total_usage.add(call_usage)
+
+        if not review_text:
+            print(
+                f"Warning: No response received for {doc_file}",
+                file=sys.stderr,
+            )
+            continue
+
+        # Determine review output file
+        format_ext = FORMAT_EXTENSIONS[args.output_format]
+        review_file = output_dir / f"{doc_basename}{format_ext}"
+
+        # Determine if we should write diff/msg files
+        write_diff_msg = args.diff or args.output_format in ("text", "json")
+
+        # Extract commit message and diff first (before stripping markers)
+        commit_msg, diff = "", ""
+        if write_diff_msg:
+            if args.output_format == "json":
+                # Will extract from JSON below
+                pass
+            else:
+                # Parse from text format markers
+                commit_msg, diff = parse_review_text(review_text)
+
+        # For non-text formats with --diff, strip the markers from display output
+        display_text = review_text
+        if args.diff and args.output_format in ("markdown", "html"):
+            display_text = strip_diff_markers(review_text)
+
+        # Build formatted output text
+        if args.output_format == "text":
+            output_text = review_text
+        elif args.output_format == "json":
+            # Try to parse JSON response
+            try:
+                review_data = json.loads(review_text)
+            except json.JSONDecodeError:
+                print("Warning: Response is not valid JSON", file=sys.stderr)
+                review_data = {"raw_response": review_text}
+
+            # Extract diff/msg from JSON if present
+            if write_diff_msg:
+                if isinstance(review_data, dict) and "raw_response" not in review_data:
+                    commit_msg = review_data.get("commit_message", "")
+                    diff = review_data.get("diff", "")
+
+            # Add metadata
+            usage_data = {
+                "api_calls": call_usage.api_calls,
+                "input_tokens": call_usage.input_tokens,
+                "output_tokens": call_usage.output_tokens,
+                "total_tokens": call_usage.input_tokens + call_usage.output_tokens,
+            }
+            if call_usage.cache_creation_tokens:
+                usage_data["cache_creation_tokens"] = call_usage.cache_creation_tokens
+            if call_usage.cache_read_tokens:
+                usage_data["cache_read_tokens"] = call_usage.cache_read_tokens
+            if args.show_costs:
+                usage_data["estimated_cost_usd"] = round(
+                    estimate_cost(call_usage, args.provider, model), 6
+                )
+
+            output_data = {
+                "metadata": {
+                    "doc_file": doc_file,
+                    "provider": args.provider,
+                    "provider_name": provider_name,
+                    "model": model,
+                    "commit_prefix": commit_prefix,
+                    "token_usage": usage_data,
+                },
+                "review": review_data,
+            }
+            output_text = json.dumps(output_data, indent=2)
+        elif args.output_format == "markdown":
+            output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{display_text}
+"""
+        elif args.output_format == "html":
+            output_text = f"""<!DOCTYPE html>
+<html>
+<head>
+<meta charset="utf-8">
+<title>Review: {doc_path.name}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; max-width: 900px; margin: 2em auto; padding: 0 1em; }}
+h1 {{ color: #333; }}
+.review-meta {{ color: #666; font-style: italic; }}
+pre {{ background: #f5f5f5; padding: 1em; overflow-x: auto; }}
+</style>
+</head>
+<body>
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+<div class="review-content">
+{display_text}
+</div>
+</body>
+</html>
+"""
+
+        # Write formatted review to file
+        review_file.write_text(output_text)
+        print(f"Review written to: {review_file}", file=sys.stderr)
+
+        # Write diff/msg files
+        if write_diff_msg:
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+            else:
+                msg_file.write_text("# No commit message generated\n")
+                print("Warning: Could not extract commit message", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+            else:
+                diff_file.write_text("# No changes suggested\n")
+                print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print to stdout unless quiet (or multiple files without verbose)
+        show_stdout = not args.quiet and (num_files == 1 or args.verbose)
+        if show_stdout:
+            print(
+                f"\n=== Documentation Review: {doc_path.name} "
+                f"(via {provider_name}) ==="
+            )
+            print(output_text)
+
+            # Print usage instructions for text format
+            if args.output_format == "text":
+                print("\n=== Output Files ===")
+                print(f"Commit message: {msg_file}")
+                print(f"Diff file:      {diff_file}")
+                print("\nTo apply changes:")
+                print(f"  git apply {diff_file}")
+                print(f"  git commit -sF {msg_file}")
+
+        # Send email if requested
+        if args.send_email:
+            if args.output_format != "text":
+                print(
+                    f"Note: Email will be sent as plain text regardless of "
+                    f"--format={args.output_format}",
+                    file=sys.stderr,
+                )
+
+            review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+            # Build email body
+            email_body = f"""AI-generated documentation review of {doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+            if args.verbose:
+                print("", file=sys.stderr)
+                print("=== Email Details ===", file=sys.stderr)
+                print(f"Subject: {review_subject}", file=sys.stderr)
+                print("=====================", file=sys.stderr)
+
+            send_email(
+                args.to_addrs,
+                args.cc_addrs,
+                from_addr,
+                review_subject,
+                None,
+                email_body,
+                args.dry_run,
+                args.verbose,
+            )
+
+            if not args.dry_run:
+                print("", file=sys.stderr)
+                print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+    # Print summary for multiple files
+    if num_files > 1:
+        print(f"\n{'=' * 60}", file=sys.stderr)
+        print(f"Processed {num_files} files", file=sys.stderr)
+        print(f"Output directory: {output_dir}", file=sys.stderr)
+
+    # Print token usage summary
+    if total_usage.api_calls > 0:
+        print("", file=sys.stderr)
+        print(
+            format_token_summary(
+                total_usage, args.provider, model, args.show_costs
+            ),
+            file=sys.stderr,
+        )
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0
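[Editor's illustration] The marker-based extraction that review-doc.py performs in `parse_review_text()` can be exercised standalone. This is a minimal sketch using the same regular expressions as the patch above; the sample response text and the helper name `extract_sections` are invented for illustration:

```python
import re


def extract_sections(review_text: str) -> tuple[str, str]:
    """Sketch of the marker parsing done by parse_review_text()."""
    commit_msg, diff = "", ""
    msg_match = re.search(
        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
        review_text,
        re.DOTALL,
    )
    if msg_match:
        commit_msg = msg_match.group(1).strip()
    diff_match = re.search(
        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
        review_text,
        re.DOTALL,
    )
    if diff_match:
        # Strip an optional ```diff code fence the model may have added
        diff = re.sub(r"^```diff\s*\n?", "", diff_match.group(1).strip())
        diff = re.sub(r"\n?```\s*$", "", diff)
    return commit_msg, diff


# Invented example of a model response using the expected markers
sample = (
    "Review prose...\n"
    "---COMMIT_MESSAGE_START---\n"
    "doc: fix typo in example guide\n"
    "---COMMIT_MESSAGE_END---\n"
    "---UNIFIED_DIFF_START---\n"
    "--- a/doc.rst\n"
    "+++ b/doc.rst\n"
    "---UNIFIED_DIFF_END---\n"
)
msg, diff = extract_sections(sample)
```

The non-greedy `(.*?)` with `re.DOTALL` stops at the first end marker, so a diff body that itself begins with `---` lines (as unified diffs do) is still captured correctly.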


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v12 5/6] doc: add AI-assisted patch review to contributing guide
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (3 preceding siblings ...)
  2026-04-01 15:38     ` [PATCH v12 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-04-01 15:38     ` Stephen Hemminger
  2026-04-01 15:38     ` [PATCH v12 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a new section to the contributing guide describing the
analyze-patch.py script, which uses AI providers to review patches
against DPDK coding standards before submission to the mailing list.

The new section covers basic usage, provider selection, patch series
handling, LTS release review, and output format options. A note
clarifies that AI review supplements but does not replace human
review.

Also add a reference to the script in the new driver guide's
test tools checklist.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/contributing/new_driver.rst |  2 +
 doc/guides/contributing/patches.rst    | 59 ++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/guides/contributing/new_driver.rst b/doc/guides/contributing/new_driver.rst
index 555e875329..6c0d356cfd 100644
--- a/doc/guides/contributing/new_driver.rst
+++ b/doc/guides/contributing/new_driver.rst
@@ -210,3 +210,5 @@ Be sure to run the following test tools per patch in a patch series:
 * `check-doc-vs-code.sh`
 * `check-spdx-tag.sh`
 * Build documentation and validate how output looks
+* Optionally run ``analyze-patch.py`` for AI-assisted review
+  (see :ref:`ai_assisted_review` in the Contributing Guide)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 5f554d47e6..1e50799c19 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -183,6 +183,10 @@ Make your planned changes in the cloned ``dpdk`` repo. Here are some guidelines
 
 * Code and related documentation must be updated atomically in the same patch.
 
+* Consider running the :ref:`AI-assisted review <ai_assisted_review>` tool
+  before submitting to catch common issues early.
+  This is encouraged but not required.
+
 Once the changes have been made you should commit them to your local repo.
 
 For small changes, that do not require specific explanations, it is better to keep things together in the
@@ -503,6 +507,61 @@ Additionally, when contributing to the DTS tool, patches should also be checked
 the ``dts-check-format.sh`` script in the ``devtools`` directory of the DPDK repo.
 To run the script, extra :ref:`Python dependencies <dts_deps>` are needed.
 
+
+.. _ai_assisted_review:
+
+AI-Assisted Patch Review
+------------------------
+
+Contributors may optionally use the ``devtools/analyze-patch.py`` script
+to get an AI-assisted review of patches before submitting them to the mailing list.
+The script checks patches against the DPDK coding standards and contribution
+guidelines documented in ``AGENTS.md``.
+
+The script supports multiple AI providers (Anthropic Claude, OpenAI ChatGPT,
+xAI Grok, Google Gemini).  An API key for the chosen provider must be set
+in the corresponding environment variable (see ``--list-providers``).
+
+Basic usage::
+
+   # Review a single patch (default provider: Anthropic Claude)
+   devtools/analyze-patch.py my-patch.patch
+
+   # Use a different provider
+   devtools/analyze-patch.py -p openai my-patch.patch
+
+   # Review for an LTS branch (enables stricter rules)
+   devtools/analyze-patch.py -r 24.11 my-patch.patch
+
+   # List available providers and their API key variables
+   devtools/analyze-patch.py --list-providers
+
+For a patch series in an mbox file, the ``--split-patches`` option reviews
+each patch individually::
+
+   devtools/analyze-patch.py --split-patches series.mbox
+
+   # Review only a range of patches
+   devtools/analyze-patch.py --split-patches --patch-range 1-5 series.mbox
+
+When reviewing for a Long Term Stable (LTS) release, use the ``-r`` option
+with the target version.  Any DPDK release with minor version ``.11``
+(e.g., 23.11, 24.11) is automatically recognized as LTS,
+and the script will enforce stricter rules: bug fixes only, no new features or APIs.
+
+Output can be formatted as plain text (default), Markdown, HTML, or JSON::
+
+   devtools/analyze-patch.py -f markdown -o review.md my-patch.patch
+
+The review guidelines in ``AGENTS.md`` focus on correctness bug detection
+and other DPDK-specific requirements. Commit message formatting and
+SPDX/copyright compliance are checked by ``checkpatches.sh`` and are
+not duplicated in the AI review.
+
+.. note::
+
+   Always verify AI suggestions before acting on them.
+
 .. _contrib_check_compilation:
 
 Checking Compilation
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v12 6/6] MAINTAINERS: add section for AI review tools
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (4 preceding siblings ...)
  2026-04-01 15:38     ` [PATCH v12 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
@ 2026-04-01 15:38     ` Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-01 15:38 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Thomas Monjalon

Add maintainer entries for the AI-assisted code review tooling:
AGENTS.md, analyze-patch.py, compare-reviews.sh, and
review-doc.py.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0f5539f851..c052b6c203 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -109,6 +109,14 @@ F: license/
 F: .editorconfig
 F: .mailmap
 
+AI review tools
+M: Stephen Hemminger <stephen@networkplumber.org>
+M: Aaron Conole <aconole@redhat.com>
+F: AGENTS.md
+F: devtools/analyze-patch.py
+F: devtools/compare-reviews.sh
+F: devtools/review-doc.py
+
 Linux kernel uAPI headers
 M: Maxime Coquelin <maxime.coquelin@redhat.com>
 F: devtools/linux-uapi.sh
-- 
2.53.0


* Re: [PATCH v12 2/6] devtools: add multi-provider AI patch review script
  2026-04-01 15:38     ` [PATCH v12 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-04-02  4:00       ` sunyuechi
  0 siblings, 0 replies; 51+ messages in thread
From: sunyuechi @ 2026-04-02  4:00 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Aaron Conole

> +        sections = []
> +        for label, review in reviews:
> +            sections.append(f"<h2>{label}</h2>\n{review}")
> +        return "\n<hr>\n".join(sections)
> +    elif output_format == "markdown":
> +        sections = []
> +        for label, review in reviews:
> +            sections.append(f"## {label}\n\n{review}")
> +        return "\n\n---\n\n".join(sections)
> +    else:  # text
> +        sections = []
> +        for label, review in reviews:
> +            sections.append(f"=== {label} ===\n\n{review}")
> +        return "\n\n" + "=" * 60 + "\n\n".join(sections)
> +

return ("\n\n" + "=" * 60 + "\n\n").join(sections)
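A quick demo of the precedence difference, since `str.join` binds tighter
than `+`: without the parentheses the banner is prepended once at the
front instead of being used as the separator between reviews.

```python
sections = ["first review", "second review"]

# Without parentheses, "\n\n".join(sections) is evaluated first,
# so the banner string is prepended once, not placed between reviews.
buggy = "\n\n" + "=" * 60 + "\n\n".join(sections)

# Parenthesized, the whole banner string becomes the separator.
fixed = ("\n\n" + "=" * 60 + "\n\n").join(sections)

print(repr(buggy))
print(repr(fixed))
```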


* Re: [PATCH v12 4/6] devtools: add multi-provider AI documentation review script
  2026-04-01 15:38     ` [PATCH v12 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-04-02  4:05       ` sunyuechi
  0 siblings, 0 replies; 51+ messages in thread
From: sunyuechi @ 2026-04-02  4:05 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Aaron Conole

> +    try:
> +        with urlopen(req) as response:
> +            result = json.loads(response.read().decode("utf-8"))
> +    except HTTPError as e:
>
Could add a timeout to avoid hanging if the server is unresponsive. (There's also another place that uses urlopen.)
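For reference, a sketch of what that could look like; the timeout value
and the helper name are placeholders, not the script's actual code:

```python
import json
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

API_TIMEOUT = 120  # seconds; placeholder, long AI completions need headroom


def fetch_json(req, timeout=API_TIMEOUT):
    """Send the request but fail instead of hanging on a dead server."""
    try:
        with urlopen(req, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except HTTPError as e:
        # HTTPError is a URLError subclass, so it must be caught first.
        raise RuntimeError("API returned HTTP %d" % e.code) from e
    except (URLError, TimeoutError) as e:
        raise RuntimeError("request failed or timed out: %s" % e) from e
```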


* [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review
  2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
                     ` (8 preceding siblings ...)
  2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-04-02 19:44   ` Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
                       ` (5 more replies)
  9 siblings, 6 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add guidelines and tooling for AI-assisted code review of DPDK
patches.

AGENTS.md provides a two-tier review framework: correctness bugs
(resource leaks, use-after-free, race conditions) are reported at
>=50% confidence; style issues require >80% with false positive
suppression. Mechanical checks handled by checkpatches.sh are
excluded to avoid redundant findings.

The analyze-patch.py and review-doc.py scripts support multiple AI
providers (Anthropic, OpenAI, xAI, Google) with mbox splitting,
prompt caching, direct SMTP sending, and token usage tracking with
optional cost estimation.

v13 - incorporate review feedback
      fix bugs found by AI self review
      add release note

Stephen Hemminger (6):
  doc: add AGENTS.md for AI code review tools
  devtools: add multi-provider AI patch review script
  devtools: add compare-reviews.sh for multi-provider analysis
  devtools: add multi-provider AI documentation review script
  doc: add AI-assisted patch review to contributing guide
  MAINTAINERS: add section for AI review tools

 AGENTS.md                              | 2162 ++++++++++++++++++++++++
 MAINTAINERS                            |    8 +
 devtools/analyze-patch.py              | 1603 ++++++++++++++++++
 devtools/compare-reviews.sh            |  263 +++
 devtools/review-doc.py                 | 1341 +++++++++++++++
 doc/guides/contributing/new_driver.rst |    2 +
 doc/guides/contributing/patches.rst    |   59 +
 doc/guides/rel_notes/release_26_07.rst |    5 +
 8 files changed, 5443 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 devtools/analyze-patch.py
 create mode 100755 devtools/compare-reviews.sh
 create mode 100755 devtools/review-doc.py

-- 
2.53.0


* [PATCH v13 1/6] doc: add AGENTS.md for AI code review tools
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
@ 2026-04-02 19:44     ` Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Provide structured guidelines for AI tools reviewing DPDK
patches. The document focuses on correctness bug detection
(resource leaks, use-after-free, race conditions), C coding style,
forbidden tokens, API conventions, and severity classifications.

Mechanical checks already handled by checkpatches.sh (SPDX
format, commit message formatting, tag ordering) are excluded
to avoid redundant and potentially contradictory findings.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 AGENTS.md | 2162 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2162 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..d49ed859f1
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,2162 @@
+# AGENTS.md - DPDK Code Review Guidelines for AI Tools
+
+## CRITICAL INSTRUCTION - READ FIRST
+
+This document has two categories of review rules with different
+confidence thresholds:
+
+### 1. Correctness Bugs -- HIGHEST PRIORITY (report at >=50% confidence)
+
+**Always report potential correctness bugs.** These are the most
+valuable findings. When in doubt, report them with a note about
+your confidence level. A possible use-after-free or resource leak
+is worth mentioning even if you are not certain.
+
+Correctness bugs include:
+- Use-after-free (accessing memory after `free`/`rte_free`)
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference
+- Buffer overflows or out-of-bounds access
+- Uninitialized variable use in a reachable code path
+- Race conditions (unsynchronized shared state)
+- `volatile` used instead of atomic operations for inter-thread shared variables
+- `__atomic_load_n()`/`__atomic_store_n()`/`__atomic_*()` GCC built-ins instead of `rte_atomic_*_explicit()`
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` legacy barriers instead of `rte_atomic_thread_fence()`
+- Missing error checks on functions that can fail
+- Error paths that skip cleanup (goto labels, missing free/close)
+- Incorrect error propagation (wrong return value, lost errno)
+- Logic errors in conditionals (wrong operator, inverted test)
+- Integer overflow/truncation in size calculations
+- Missing bounds checks on user-supplied sizes or indices
+- `mmap()` return checked against `NULL` instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=`
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied data
+- `1 << n` on 64-bit bitmask (must use `1ULL << n` or `RTE_BIT64()`)
+- Left shift of narrow unsigned (`uint8_t`/`uint16_t`) used as 64-bit value (sign extension via implicit `int` promotion)
+- Variable assigned then overwritten before being read (dead store)
+- Same variable used as loop counter in nested loops
+- `memcpy`/`memcmp`/`memset` with same pointer for source and destination (no-op or undefined)
+- `rte_mbuf_raw_free_bulk()` called on mbufs that may originate from different mempools (Tx burst, ring dequeue)
+- MTU confused with frame length (MTU is L3 payload; frame length = MTU + L2 overhead)
+- Using `dev_conf.rxmode.mtu` after configure instead of `dev->data->mtu`
+- Hardcoded Ethernet overhead instead of per-device calculation
+- MTU set without enabling `RTE_ETH_RX_OFFLOAD_SCATTER` when frame size exceeds mbuf data room
+- `mtu_set` callback rejects valid MTU when scatter Rx is already enabled
+- Rx queue setup silently drops oversized packets instead of enabling scatter or returning an error
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size check
+
+**Do NOT self-censor correctness bugs.** If you identify a code
+path where a resource could leak or memory could be used after
+free, report it. Do not talk yourself out of it.
+
+### 2. Style, Process, and Formatting -- suppress false positives
+
+**NEVER list a style/process item under "Errors" or "Warnings" if
+you conclude it is correct.**
+
+Before outputting any style, formatting, or process error/warning,
+verify it is actually wrong. If your analysis concludes with
+phrases like "there's no issue here", "which is fine", "appears
+correct", "is acceptable", or "this is actually correct" -- then
+DO NOT INCLUDE IT IN YOUR OUTPUT AT ALL. Delete it. Omit it
+entirely.
+
+This suppression rule applies to: naming conventions,
+code style, and process compliance. It does NOT apply to
+correctness bugs listed above. (SPDX/copyright format and
+commit message formatting are handled by checkpatch and are
+excluded from AI review entirely.)
+
+---
+
+This document provides guidelines for AI-powered code review tools
+when reviewing contributions to the Data Plane Development Kit
+(DPDK). It is derived from the official DPDK contributor guidelines
+and validation scripts.
+
+## Overview
+
+DPDK follows a development process modeled on the Linux Kernel. All
+patches are reviewed publicly on the mailing list before being
+merged. AI review tools should verify compliance with the standards
+outlined below.
+
+## Review Philosophy
+
+**Correctness bugs are the primary goal of AI review.** Style and
+formatting checks are secondary. A review that catches a
+use-after-free but misses a style nit is far more valuable than
+one that catches every style issue but misses the bug.
+
+**BEFORE OUTPUTTING YOUR REVIEW**: Re-read each item.
+- For correctness bugs: keep them. If you have reasonable doubt
+  that a code path is safe, report it.
+- For style/process items: if ANY item contains phrases like "is
+  fine", "no issue", "appears correct", "is acceptable",
+  "actually correct" -- DELETE THAT ITEM. Do not include it.
+
+### Correctness review guidelines
+- Trace error paths: for every function that allocates a resource
+  or acquires a lock, verify that ALL error paths after that point
+  release it
+- Check every `goto error` and early `return`: does it clean up
+  everything allocated so far?
+- Look for use-after-free: after `free(p)`, is `p` accessed again?
+- Check that error codes are propagated, not silently dropped
+- Report at >=50% confidence; note uncertainty if appropriate
+- It is better to report a potential bug that turns out to be safe
+  than to miss a real bug
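+
+For example, the error-path tracing rule above looks for this shape
+(hypothetical `mydrv_init()`, shown only to illustrate the pattern):
+
+```c
+static int
+mydrv_init(struct mydrv *dev)
+{
+    dev->ring = rte_malloc(NULL, RING_SIZE, 0);
+    if (dev->ring == NULL)
+        return -ENOMEM;
+
+    dev->fd = open("/dev/mydrv", O_RDWR);
+    if (dev->fd < 0)
+        goto free_ring;        /* first resource must be released */
+
+    if (mydrv_start(dev) < 0)
+        goto close_fd;         /* later failures unwind everything */
+
+    return 0;
+
+close_fd:
+    close(dev->fd);
+free_ring:
+    rte_free(dev->ring);
+    return -1;
+}
+```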
+
+### Style and process review guidelines
+- Only comment on style/process issues when you have HIGH CONFIDENCE (>80%) that an issue exists
+- Be concise: one sentence per comment when possible
+- Focus on actionable feedback, not observations
+- When reviewing text, only comment on clarity issues if the text is genuinely
+  confusing or could lead to errors.
+- Do NOT comment on copyright years, SPDX format, or copyright holders - not subject to AI review
+- Do NOT report an issue then contradict yourself - if something is acceptable, do not mention it at all
+- Do NOT include items in Errors/Warnings that you then say are "acceptable" or "correct"
+- Do NOT mention things that are correct or "not an issue" - only report actual problems
+- Do NOT speculate about contributor circumstances (employment, company policies, etc.)
+- Before adding any style item to your review, ask: "Is this actually wrong?" If no, omit it entirely.
+- NEVER write "(Correction: ...)" - if you need to correct yourself, simply omit the item entirely
+- Do NOT add vague suggestions like "should be verified" or "should be checked" - either it's wrong or don't mention it
+- Do NOT flag something as an Error then say "which is correct" in the same item
+- Do NOT say "no issue here" or "this is actually correct" - if there's no issue, do not include it in your review
+- Do NOT analyze cross-patch dependencies or compilation order - you cannot reliably determine this from patch review
+- Do NOT claim a patch "would cause compilation failure" based on symbols used in other patches in the series
+- Review each patch individually for its own correctness; assume the patch author ordered them correctly
+- When reviewing a patch series, OMIT patches that have no issues. Do not include a patch in your output just to say "no issues found" or to summarize what the patch does. Only include patches where you have actual findings to report.
+
+## Priority Areas (Review These)
+
+### Security & Safety
+- Unsafe code blocks without justification
+- Command injection risks (shell commands, user input)
+- Path traversal vulnerabilities
+- Credential exposure or hard-coded secrets
+- Missing input validation on external data
+- Improper error handling that could leak sensitive info
+
+### Correctness Issues
+- Logic errors that could cause panics or incorrect behavior
+- Buffer overflows
+- Race conditions
+- **`volatile` for inter-thread synchronization**: `volatile` does not
+  provide atomicity or memory ordering between threads. Use
+  `rte_atomic_load_explicit()`/`rte_atomic_store_explicit()` with
+  appropriate `rte_memory_order_*` instead. See the Shared Variable
+  Access section under Forbidden Tokens for details.
+- Resource leaks (files, connections, memory)
+- Off-by-one errors or boundary conditions
+- Incorrect error propagation
+- **Use-after-free** (any access to memory after it has been freed)
+- **Error path resource leaks**: For every allocation or fd open,
+  trace each error path (`goto`, early `return`, conditional) to
+  verify the resource is released. Common patterns to check:
+  - `malloc`/`rte_malloc` followed by a failure that does `return -1`
+    instead of `goto cleanup`
+  - `open()`/`socket()` fd not closed on a later error
+  - Lock acquired but not released on an error branch
+  - Partially initialized structure where early fields are allocated
+    but later allocation fails without freeing the early ones
+- **Double-free / double-close**: resource freed in both a normal
+  path and an error path, or fd closed but not set to -1 allowing
+  a second close
+- **Missing error checks**: functions that can fail (malloc, open,
+  ioctl, etc.) whose return value is not checked
+- Changes to API without release notes
+- Changes to ABI on non-LTS release
+- Usage of deprecated APIs when replacements exist
+- Overly defensive code that adds unnecessary checks
+- Unnecessary comments that just restate what the code already shows (remove them)
+- **Process-shared synchronization errors** (pthread mutexes in shared memory without `PTHREAD_PROCESS_SHARED`)
+- **`mmap()` checked against NULL instead of `MAP_FAILED`**: `mmap()` returns
+  `MAP_FAILED` (i.e., `(void *)-1`) on failure, NOT `NULL`. Checking
+  `== NULL` or `!= NULL` will miss the error and use an invalid pointer.
+  ```c
+  /* BAD - mmap never returns NULL on failure */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == NULL)       /* WRONG - will not catch MAP_FAILED */
+      return -1;
+
+  /* GOOD */
+  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+  if (p == MAP_FAILED)
+      return -1;
+  ```
+- **Statistics accumulation using `=` instead of `+=`**: When accumulating
+  statistics (counters, byte totals, packet counts), using `=` overwrites
+  the running total with only the latest value. This silently produces
+  wrong results.
+  ```c
+  /* BAD - overwrites instead of accumulating */
+  stats->rx_packets = nb_rx;
+  stats->rx_bytes = total_bytes;
+
+  /* GOOD - accumulates over time */
+  stats->rx_packets += nb_rx;
+  stats->rx_bytes += total_bytes;
+  ```
+  Note: `=` is correct for gauge-type values (e.g., queue depth, link
+  status) and for initial assignment. Only flag when the context is
+  clearly incremental accumulation (loop bodies, per-burst counters,
+  callback tallies).
+- **Integer multiply without widening cast**: When multiplying integers
+  to produce a result wider than the operands (sizes, offsets, byte
+  counts), the multiplication is performed at the operand width and
+  the upper bits are silently lost before the assignment. This applies
+  to any narrowing scenario: 16×16 assigned to a 32-bit variable,
+  32×32 assigned to a 64-bit variable, etc.
+  ```c
+  /* BAD - 32×32 overflows before widening to 64 */
+  uint64_t total_size = num_entries * entry_size;  /* both are uint32_t */
+  size_t offset = ring->idx * ring->desc_size;     /* 32×32 → truncated */
+
+  /* BAD - 16×16 overflows before widening to 32 */
+  uint32_t byte_count = pkt_len * nb_segs;         /* both are uint16_t */
+
+  /* GOOD - widen before multiply */
+  uint64_t total_size = (uint64_t)num_entries * entry_size;
+  size_t offset = (size_t)ring->idx * ring->desc_size;
+  uint32_t byte_count = (uint32_t)pkt_len * nb_segs;
+  ```
+- **Unbounded descriptor chain traversal**: When walking a chain of
+  descriptors (virtio, DMA, NIC Rx/Tx rings) where the chain length
+  or next-index comes from guest memory or an untrusted API caller,
+  the traversal MUST have a bounds check or loop counter to prevent
+  infinite loops or out-of-bounds access from malicious/corrupt data.
+  ```c
+  /* BAD - guest controls desc[idx].next with no bound */
+  while (desc[idx].flags & VRING_DESC_F_NEXT) {
+      idx = desc[idx].next;          /* guest-supplied, unbounded */
+      process(desc[idx]);
+  }
+
+  /* GOOD - cap iterations to descriptor ring size */
+  for (i = 0; i < ring_size; i++) {
+      if (!(desc[idx].flags & VRING_DESC_F_NEXT))
+          break;
+      idx = desc[idx].next;
+      if (idx >= ring_size)          /* bounds check */
+          return -EINVAL;
+      process(desc[idx]);
+  }
+  ```
+  This applies to any chain/linked-list traversal where indices or
+  pointers originate from untrusted input (guest VMs, user-space
+  callers, network packets).
+- **Bitmask shift using `1` instead of `1ULL` on 64-bit masks**: The
+  literal `1` is `int` (32 bits). Shifting it by 32 or more is
+  undefined behavior; shifting it by less than 32 but assigning to a
+  `uint64_t` silently zeroes the upper 32 bits. Use `1ULL << n`,
+  `UINT64_C(1) << n`, or the DPDK `RTE_BIT64(n)` macro.
+  ```c
+  /* BAD - 1 is int, UB if n >= 32, wrong if result used as uint64_t */
+  uint64_t mask = 1 << bit_pos;
+  if (features & (1 << VIRTIO_NET_F_MRG_RXBUF))  /* bit 15 OK, bit 32+ UB */
+
+  /* GOOD */
+  uint64_t mask = UINT64_C(1) << bit_pos;
+  uint64_t mask = 1ULL << bit_pos;
+  uint64_t mask = RTE_BIT64(bit_pos);        /* preferred in DPDK */
+  if (features & RTE_BIT64(VIRTIO_NET_F_MRG_RXBUF))
+  ```
+  Note: `1U << n` is acceptable when the mask is known to be 32-bit
+  (e.g., `uint32_t` register fields with `n < 32`). Only flag when
+  the result is stored in, compared against, or returned as a 64-bit
+  type, or when `n` could be >= 32.
+- **Left shift of narrow unsigned type sign-extends to 64-bit**: When
+  a `uint8_t` or `uint16_t` value is left-shifted, C integer promotion
+  converts it to `int` (signed 32-bit) before the shift. If the result
+  has bit 31 set, implicit conversion to `uint64_t`, `size_t`, or use
+  in pointer arithmetic sign-extends the upper 32 bits to all-1s,
+  producing a wrong address or value. This is Coverity SIGN_EXTENSION.
+  The fix is to cast the narrow operand to an unsigned type at least as
+  wide as the target before shifting.
+  ```c
+  /* BAD - uint16_t promotes to signed int, bit 31 may set,
+   * then sign-extends when converted to 64-bit for pointer math */
+  uint16_t idx = get_index();
+  void *addr = base + (idx << wqebb_shift);      /* SIGN_EXTENSION */
+  uint64_t off = (uint64_t)(idx << shift);        /* too late: shift already in int */
+
+  /* BAD - uint8_t shift with result used as size_t */
+  uint8_t page_order = get_order();
+  size_t size = page_order << PAGE_SHIFT;          /* promotes to int first */
+
+  /* GOOD - cast before shift */
+  void *addr = base + ((uint64_t)idx << wqebb_shift);
+  uint64_t off = (uint64_t)idx << shift;
+  size_t size = (size_t)page_order << PAGE_SHIFT;
+
+  /* GOOD - intermediate unsigned variable */
+  uint32_t offset = (uint32_t)idx << wqebb_shift;  /* OK if result fits 32 bits */
+  ```
+  Note: This is distinct from the `1 << n` pattern (where the literal
+  `1` is the problem) and from the integer-multiply pattern (where
+  the operation is `*` not `<<`). The mechanism is the same C integer
+  promotion rule, but the code patterns and Coverity checker names
+  differ. Only flag when the shift result is used in a context wider
+  than 32 bits (64-bit assignment, pointer arithmetic, function
+  argument expecting `uint64_t`/`size_t`). A shift whose result is
+  stored in a `uint32_t` or narrower variable is not affected.
+- **Variable overwrite before read (dead store)**: A variable is
+  assigned a value that is unconditionally overwritten before it is
+  ever read. This usually indicates a logic error (wrong variable
+  name, missing `if`, copy-paste mistake) or at minimum is dead code.
+  ```c
+  /* BAD - first assignment is never read */
+  ret = validate_input(cfg);
+  ret = apply_config(cfg);     /* overwrites without checking first ret */
+  if (ret != 0)
+      return ret;
+
+  /* GOOD - check each return value */
+  ret = validate_input(cfg);
+  if (ret != 0)
+      return ret;
+  ret = apply_config(cfg);
+  if (ret != 0)
+      return ret;
+  ```
+  Do NOT flag cases where the initial value is intentionally a default
+  that may or may not be overwritten (e.g., `int ret = 0;` followed
+  by a conditional assignment). Only flag unconditional overwrites
+  where the first value can never be observed.
+- **Shared loop counter in nested loops**: Using the same variable as
+  the loop counter in both an outer and inner loop causes the outer
+  loop to malfunction because the inner loop modifies its counter.
+  ```c
+  /* BAD - inner loop clobbers outer loop counter */
+  int i;
+  for (i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (i = 0; i < nb_descs; i++)    /* BUG: reuses i */
+          init_desc(i);
+  }
+
+  /* GOOD - distinct loop counters */
+  for (int i = 0; i < nb_queues; i++) {
+      setup_queue(i);
+      for (int j = 0; j < nb_descs; j++)
+          init_desc(j);
+  }
+  ```
+- **`memcpy`/`memcmp`/`memset` self-argument (same pointer as both
+  operands)**: Passing the same pointer as both source and destination
+  to `memcpy()` is undefined behavior per C99. Passing the same
+  pointer to both arguments of `memcmp()` is a no-op that always
+  returns 0, indicating a logic error (usually a copy-paste mistake
+  with the wrong variable name). The same applies to `rte_memcpy()`
+  and `memmove()` with identical arguments.
+  ```c
+  /* BAD - memcpy with same src and dst is undefined behavior */
+  memcpy(buf, buf, len);
+  rte_memcpy(dst, dst, len);
+
+  /* BAD - memcmp with same pointer always returns 0 (logic error) */
+  if (memcmp(key, key, KEY_LEN) == 0)  /* always true, wrong variable? */
+
+  /* BAD - likely copy-paste: should be comparing two different MACs */
+  if (memcmp(&eth->src_addr, &eth->src_addr, RTE_ETHER_ADDR_LEN) == 0)
+
+  /* GOOD - comparing two different things */
+  memcpy(dst, src, len);
+  if (memcmp(&eth->src_addr, &eth->dst_addr, RTE_ETHER_ADDR_LEN) == 0)
+  ```
+  This pattern almost always indicates a copy-paste bug where one of
+  the arguments should be a different variable.
+- **`rte_mbuf_raw_free_bulk()` on mixed-pool mbuf arrays**: Tx burst functions
+  and ring/queue dequeue paths receive mbufs that may originate from different
+  mempools (applications are free to send mbufs from any pool).
+  `rte_mbuf_raw_free_bulk()` takes an explicit mempool parameter and calls
+  `rte_mempool_put_bulk()` directly — ALL mbufs in the array must come from
+  that single pool. If mbufs come from different pools, they are returned to
+  the wrong pool, corrupting pool accounting and causing hard-to-debug failures.
+  Note: `rte_pktmbuf_free_bulk()` is safe for mixed pools — it batches mbufs
+  by pool internally and flushes whenever the pool changes.
+  ```c
+  /* BAD - assumes all mbufs are from the same pool */
+  /* (in tx_burst completion or ring dequeue error path) */
+  rte_mbuf_raw_free_bulk(mp, mbufs, nb_mbufs);
+
+  /* GOOD - rte_pktmbuf_free_bulk handles mixed pools correctly */
+  rte_pktmbuf_free_bulk(mbufs, nb_mbufs);
+
+  /* GOOD - free individually (each mbuf returned to its own pool) */
+  for (i = 0; i < nb_mbufs; i++)
+      rte_pktmbuf_free(mbufs[i]);
+  ```
+  This applies to any path that frees mbufs submitted by the application:
+  Tx completion, Tx error cleanup, and ring/queue drain paths.
+  `rte_mbuf_raw_free_bulk()` is an optimization for the fast-free case
+  (`RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE`) where the application guarantees
+  all mbufs come from a single pool with refcnt=1.
+- **MTU confused with Ethernet frame length**: Maximum Transmission Unit
+  (MTU) is the maximum L3 payload size (e.g., 1500 bytes for standard
+  Ethernet). The maximum Ethernet *frame length* includes L2 overhead:
+  Ethernet header (14 bytes) + optional VLAN tags (4 bytes each) + CRC
+  (4 bytes). The overhead varies per device depending on supported
+  encapsulations (VLAN, QinQ, etc.). Confusing MTU with frame length
+  produces off-by-14-to-22-byte errors in packet size limits, buffer
+  sizing, and scattered Rx decisions.
+
+  **VLAN tag accounting:** The outer VLAN tag is L2 overhead and does
+  NOT count toward MTU (matching Linux and FreeBSD). A 1522-byte
+  single-tagged frame is valid at MTU 1500. However, in QinQ the
+  inner (customer) tag DOES consume MTU — it is part of the customer
+  frame. So QinQ with MTU 1500 allows only 1496 bytes of L3 payload
+  unless the port MTU is raised to 1504.
+
+  **Using `rxmode.mtu` after configure:** After `rte_eth_dev_configure()`
+  completes, the canonical MTU is stored in `dev->data->mtu`. The
+  `dev->data->dev_conf.rxmode.mtu` field is the user's *request* and
+  must not be read after configure — it becomes stale if
+  `rte_eth_dev_set_mtu()` is called later. Both configure and set_mtu
+  write to `dev->data->mtu`; PMDs should always read from there.
+
+  **Overhead calculation:** Do not hardcode a single overhead constant.
+  Use the device's own overhead calculation (typically available via
+  `dev_info.max_rx_pktlen - dev_info.max_mtu` or an internal
+  `eth_overhead` field). Different devices support different
+  encapsulations, so the overhead is not a universal constant.
+
+  **Scattered Rx decision:** PMDs compare maximum frame length
+  (MTU + per-device overhead) against Rx buffer size to decide
+  whether scattered Rx is needed. Comparing raw MTU against buffer
+  size is wrong — it underestimates the actual frame size by the
+  overhead.
+  ```c
+  /* BAD - MTU used where frame length is needed */
+  if (dev->data->mtu > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* BAD - hardcoded overhead, wrong for QinQ-capable devices */
+  #define ETHER_OVERHEAD 18  /* may be 22 or 26 for VLAN/QinQ */
+  max_frame = mtu + ETHER_OVERHEAD;
+
+  /* BAD - reading rxmode.mtu after configure (stale if set_mtu called) */
+  static int
+  mydrv_rx_queue_setup(...) {
+      mtu = dev->data->dev_conf.rxmode.mtu;  /* WRONG - may be stale */
+      ...
+  }
+
+  /* GOOD - use dev->data->mtu, the canonical post-configure value */
+  static int
+  mydrv_rx_queue_setup(...) {
+      uint16_t mtu = dev->data->mtu;
+      ...
+  }
+
+  /* GOOD - use per-device overhead for frame length calculation */
+  uint32_t frame_overhead = dev_info.max_rx_pktlen - dev_info.max_mtu;
+  uint32_t max_frame_len = dev->data->mtu + frame_overhead;
+  if (max_frame_len > rxq->buf_size)
+      enable_scattered_rx();
+
+  /* GOOD - device-specific overhead constant derived from capabilities */
+  static uint32_t
+  mydrv_eth_overhead(struct rte_eth_dev *dev) {
+      uint32_t overhead = RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_VLAN)
+          overhead += RTE_VLAN_HLEN;
+      if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_QINQ)
+          overhead += RTE_VLAN_HLEN;
+      return overhead;
+  }
+  ```
+  Note: In `rte_eth_dev_configure()` itself, reading `rxmode.mtu` is
+  correct — that is where the user's request is consumed and written
+  to `dev->data->mtu`. Only flag reads of `rxmode.mtu` *outside*
+  configure (queue setup, start, link update, MTU set, etc.).
+- **Missing scatter Rx for large MTU**: When the configured MTU
+  produces a frame size (MTU + Ethernet overhead) larger than the mbuf
+  data buffer size (`rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`),
+  the PMD MUST either enable scatter Rx (multi-segment receive) or reject
+  the configuration. Silently accepting the MTU and then truncating or
+  dropping oversized packets is a correctness bug.
+  ```c
+  /* BAD - accepts MTU but will truncate packets that don't fit */
+  static int
+  mydrv_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+  {
+      /* No check against mbuf size or scatter capability */
+      dev->data->mtu = mtu;
+      return 0;
+  }
+
+  /* BAD - rejects valid MTU even though scatter is enabled */
+  if (frame_size > mbuf_data_size)
+      return -EINVAL;  /* wrong: should allow if scatter is on */
+
+  /* GOOD - check scatter and mbuf size */
+  if (!dev->data->scattered_rx &&
+      frame_size > dev->data->min_rx_buf_size - RTE_PKTMBUF_HEADROOM)
+      return -EINVAL;
+
+  /* GOOD - auto-enable scatter when needed */
+  if (frame_size > mbuf_data_size) {
+      if (!(dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_SCATTER))
+          return -EINVAL;
+      dev->data->dev_conf.rxmode.offloads |=
+          RTE_ETH_RX_OFFLOAD_SCATTER;
+      dev->data->scattered_rx = 1;
+  }
+  ```
+  Key relationships:
+  - `dev_info.max_rx_pktlen`: maximum frame the hardware can receive
+  - `dev_info.max_mtu`: maximum MTU = `max_rx_pktlen` - overhead
+  - `dev_info.min_rx_bufsize`: minimum Rx buffer the HW requires
+  - `dev_info.max_rx_bufsize`: maximum single-descriptor buffer size
+  - `mbuf data size = rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM`
+  - When scatter is off: frame length must fit in a single mbuf
+  - When scatter is on: frame length can span multiple mbufs;
+    the PMD selects a scattered Rx function
+
+  This pattern should be checked in three places:
+  1. `dev_configure()` -- validate MTU against mbuf size / scatter
+  2. `rx_queue_setup()` -- select scattered vs non-scattered Rx path
+  3. `mtu_set()` -- runtime MTU change must re-validate
+- **Rx queue function selection ignoring scatter**: When a PMD has
+  separate fast-path Rx functions for scalar (single-segment) and
+  scattered (multi-segment) modes, it must select the scattered
+  variant whenever `dev->data->scattered_rx` is set OR when the
+  configured frame length exceeds the single mbuf data size.
+  Failing to do so causes the scalar Rx function to silently drop
+  or corrupt multi-segment packets.
+  ```c
+  /* BAD - only checks offload flag, ignores actual need */
+  if (rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER)
+      rx_func = mydrv_recv_scattered;
+  else
+      rx_func = mydrv_recv_single;  /* will drop oversized pkts */
+
+  /* GOOD - check both the flag and the size */
+  mbuf_size = rte_pktmbuf_data_room_size(rxq->mp) -
+              RTE_PKTMBUF_HEADROOM;
+  max_pkt = dev->data->mtu + overhead;
+  if ((rxmode->offloads & RTE_ETH_RX_OFFLOAD_SCATTER) ||
+      max_pkt > mbuf_size) {
+      dev->data->scattered_rx = 1;
+      rx_func = mydrv_recv_scattered;
+  } else {
+      rx_func = mydrv_recv_single;
+  }
+  ```
+
+### Architecture & Patterns
+- Code that violates existing patterns in the code base
+- Missing error handling
+- Code that is not safe against signals
+- **Environment variables used for driver configuration instead of devargs**:
+  Drivers must use DPDK device arguments (`devargs`) for runtime
+  configuration, not environment variables. Devargs are preferred
+  because they are explicitly device-specific rather than global in
+  effect, they survive launch methods that strip the environment,
+  and they can be set per device rather than per device type.
+  Parse the devargs string with `rte_kvargs_parse()`.
+  ```c
+  /* BAD - environment variable for driver tuning */
+  val = getenv("MYDRV_RX_BURST_SIZE");
+  if (val != NULL)
+      burst = atoi(val);
+
+  /* GOOD - devargs parsed at probe time */
+  static const char * const valid_args[] = { "rx_burst_size", NULL };
+  kvlist = rte_kvargs_parse(devargs->args, valid_args);
+  rte_kvargs_process(kvlist, "rx_burst_size", &parse_uint, &burst);
+  ```
+  Note: `getenv()` in EAL itself or in test/example code is acceptable.
+  This rule applies to libraries under `lib/` and drivers under `drivers/`.
+
+### New Library API Design
+
+When a patch adds a new library under `lib/`, review API design in
+addition to correctness and style.
+
+**API boundary.** A library should be a compiler, not a framework.
+The model is `rte_acl`: create a context, feed input, get structured
+output, caller decides what to do with it. No callbacks needed. If
+the library requires callers to implement a callback table to
+function, the boundary is wrong — the library is asking the caller
+to be its backend.
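+
+A minimal sketch of this shape in plain C (hypothetical `clsf_*`
+names, not a real DPDK API): create a context, feed input, read
+structured output, destroy. No callback table anywhere.
+
+```c
+/* Illustrative only - the library owns the context, the caller
+ * owns the decision about what to do with the result. */
+struct clsf_ctx {
+	int nb_rules;
+	int rules[16];
+};
+
+static struct clsf_ctx *
+clsf_create(void)
+{
+	return calloc(1, sizeof(struct clsf_ctx));
+}
+
+static int
+clsf_add_rule(struct clsf_ctx *ctx, int match_val)
+{
+	if (ctx->nb_rules >= 16)
+		return -ENOSPC;
+	ctx->rules[ctx->nb_rules++] = match_val;
+	return 0;
+}
+
+/* Structured result out, no callback in: returns the matched rule
+ * index, or -1 when nothing matches. */
+static int
+clsf_classify(const struct clsf_ctx *ctx, int input)
+{
+	for (int i = 0; i < ctx->nb_rules; i++)
+		if (ctx->rules[i] == input)
+			return i;
+	return -1;
+}
+
+static void
+clsf_destroy(struct clsf_ctx *ctx)
+{
+	free(ctx);
+}
+```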
+
+**Callback structs** (Warning / Error). Any function-pointer struct
+in an installed header is an ABI break waiting to happen. Adding or
+reordering a member breaks all consumers.
+- Prefer a single callback parameter over an ops table.
+- \>5 callbacks: **Warning** — likely needs redesign.
+- \>20 callbacks: **Error** — this is an app plugin API, not a library.
+- All callbacks must have Doxygen (contract, return values, ownership).
+- Void-returning callbacks for failable operations swallow errors —
+  flag as **Error**.
+- Callbacks serving app-specific needs (e.g. `verbose_level_get`)
+  indicate wrong code was extracted into the library.
+
+**Extensible structures.** Prefer TLV / tagged-array patterns over
+enum + union, following `rte_flow_item` and `rte_flow_action` as
+the model. Type tag + pointer to type-specific data allows adding
+types without ABI breaks. Flag as **Warning**:
+- Large enums (100+) consumers must switch on.
+- Unions that grow with every new feature.
+- Ask: "What changes when a feature is added next release?" If
+  "add an enum value and union arm" — should be TLV.
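+
+A minimal sketch of the TLV shape (hypothetical `item_*` names
+following the `rte_flow_item` model, not a real API):
+
+```c
+/* Type tag plus pointer to type-specific spec: adding a new item
+ * type next release adds an enum value and a new spec struct, but
+ * the item struct itself - and the ABI - does not change. */
+enum item_type {
+	ITEM_TYPE_END,
+	ITEM_TYPE_ETH,
+	ITEM_TYPE_IPV4,
+};
+
+struct item {
+	enum item_type type;
+	const void *spec;	/* points to a type-specific struct */
+};
+
+/* Consumers walk the array until END and may skip unknown types
+ * instead of breaking on them. */
+static int
+count_items(const struct item *items)
+{
+	int n = 0;
+
+	while (items[n].type != ITEM_TYPE_END)
+		n++;
+	return n;
+}
+```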
+
+**Installed headers.** If it's in `headers` or `indirect_headers`
+in meson.build, it's public API. Don't call it "private." If truly
+internal, don't install it.
+
+**Global state.** Prefer handle-based APIs (`create`/`destroy`)
+over singletons. `rte_acl` allows multiple independent classifier
+instances; new libraries should do the same.
+
+**Output ownership.** Prefer caller-allocated or library-allocated-
+caller-freed over internal static buffers. If static buffers are
+used, document lifetime and ensure Doxygen examples don't show
+stale-pointer usage.
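+
+A common caller-allocated pattern (sketch, hypothetical `mylib_*`
+name): return the size needed, as `snprintf()` does, so the caller
+can detect truncation and retry with a larger buffer.
+
+```c
+/* Illustrative only - the caller owns the buffer, so no internal
+ * static storage can go stale between calls. */
+static int
+mylib_fmt_status(char *buf, size_t len, int code)
+{
+	return snprintf(buf, len, "status=%d", code);
+}
+```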
+
+---
+
+## C Coding Style
+
+### General Formatting
+
+- **Tab width**: 8 characters (hard tabs for indentation, spaces for alignment)
+- **No trailing whitespace** on lines or at end of files
+- Files must end with a new line
+- Code style should be consistent within each file
+
+
+### Comments
+
+```c
+/* Most single-line comments look like this. */
+
+/*
+ * VERY important single-line comments look like this.
+ */
+
+/*
+ * Multi-line comments look like this. Make them real sentences. Fill
+ * them so they look like real paragraphs.
+ */
+```
+
+### Header File Organization
+
+Include order (each group separated by blank line):
+1. System/libc includes
+2. DPDK EAL includes
+3. DPDK misc library includes
+4. Application-specific includes
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <rte_eal.h>
+
+#include <rte_ring.h>
+#include <rte_mempool.h>
+
+#include "application.h"
+```
+
+### Header Guards
+
+```c
+#ifndef _FILE_H_
+#define _FILE_H_
+
+/* Code */
+
+#endif /* _FILE_H_ */
+```
+
+### Naming Conventions
+
+- **All external symbols** must have `RTE_` or `rte_` prefix
+- **Macros**: ALL_UPPERCASE with `RTE_` prefix
+- **Functions**: lowercase with underscores only (no CamelCase)
+- **Variables**: lowercase with underscores only
+- **Enum values**: ALL_UPPERCASE with `RTE_<ENUM>_` prefix
+
+**Exception**: Driver base directories (`drivers/*/base/`) may use different
+naming conventions when sharing code across platforms or with upstream vendor code.
+
+#### Symbol Naming for Static Linking
+
+Drivers and libraries must not expose global variables that could
+clash when statically linked with other DPDK components or
+applications. Use consistent and unique prefixes for all exported
+symbols to avoid namespace collisions.
+
+**Good practice**: Use a driver-specific or library-specific prefix for all global variables:
+
+```c
+/* Good - virtio driver uses consistent "virtio_" prefix */
+const struct virtio_ops virtio_legacy_ops = {
+	.read = virtio_legacy_read,
+	.write = virtio_legacy_write,
+	.configure = virtio_legacy_configure,
+};
+
+const struct virtio_ops virtio_modern_ops = {
+	.read = virtio_modern_read,
+	.write = virtio_modern_write,
+	.configure = virtio_modern_configure,
+};
+
+/* Good - mlx5 driver uses consistent "mlx5_" prefix */
+struct mlx5_flow_driver_ops mlx5_flow_dv_ops;
+```
+
+**Bad practice**: Generic names that may clash:
+
+```c
+/* Bad - "ops" is too generic, will clash with other drivers */
+const struct virtio_ops ops = { ... };
+
+/* Bad - "legacy_ops" could clash with other legacy implementations */
+const struct virtio_ops legacy_ops = { ... };
+
+/* Bad - "driver_config" is not unique */
+struct driver_config config;
+```
+
+**Guidelines**:
+- Prefix all global variables with the driver or library name (e.g., `virtio_`, `mlx5_`, `ixgbe_`)
+- Prefix all global functions similarly unless they use the `rte_` namespace
+- Internal static variables do not require prefixes as they have file scope
+- Consider using the `RTE_` or `rte_` prefix only for symbols that are part of the public DPDK API
+
+#### Prohibited Terminology
+
+Do not use non-inclusive naming including:
+- `master/slave` -> Use: primary/secondary, controller/worker, leader/follower
+- `blacklist/whitelist` -> Use: denylist/allowlist, blocklist/passlist
+- `cripple` -> Use: impacted, degraded, restricted, immobilized
+- `tribe` -> Use: team, squad
+- `sanity check` -> Use: coherence check, test, verification
+
+
+### Comparisons and Boolean Logic
+
+```c
+/* Pointers - compare explicitly with NULL */
+if (p == NULL)      /* Good */
+if (p != NULL)      /* Good */
+if (likely(p != NULL))   /* Good - likely/unlikely don't change this */
+if (unlikely(p == NULL)) /* Good - likely/unlikely don't change this */
+if (!p)             /* Bad - don't use ! on pointers */
+
+/* Integers - compare explicitly with zero */
+if (a == 0)         /* Good */
+if (a != 0)         /* Good */
+if (errno != 0)     /* Good - this IS explicit */
+if (likely(a != 0)) /* Good - likely/unlikely don't change this */
+if (!a)             /* Bad - don't use ! on integers */
+if (a)              /* Bad - implicit, should be a != 0 */
+
+/* Characters - compare with character constant */
+if (*p == '\0')     /* Good */
+
+/* Booleans - direct test is acceptable */
+if (flag)           /* Good for actual bool types */
+if (!flag)          /* Good for actual bool types */
+```
+
+**Explicit comparison** means using `==` or `!=` operators (e.g., `x != 0`, `p == NULL`).
+**Implicit comparison** means relying on truthiness without an operator (e.g., `if (x)`, `if (!p)`).
+**Note**: `likely()` and `unlikely()` macros do NOT affect whether a comparison is explicit or implicit.
+
+### Boolean Usage
+
+Prefer `bool` (from `<stdbool.h>`) over `int` for variables,
+parameters, and return values that are purely true/false. Using
+`bool` makes intent explicit, enables compiler diagnostics for
+misuse, and is self-documenting.
+
+```c
+/* Bad - int used as boolean flag */
+int verbose = 0;
+int is_enabled = 1;
+
+int
+check_valid(struct item *item)
+{
+	if (item->flags & ITEM_VALID)
+		return 1;
+	return 0;
+}
+
+/* Good - bool communicates intent */
+bool verbose = false;
+bool is_enabled = true;
+
+bool
+check_valid(struct item *item)
+{
+	return (item->flags & ITEM_VALID) != 0;
+}
+```
+
+**Guidelines:**
+- Use `bool` for variables that only hold true/false values
+- Use `bool` return type for predicate functions (functions that
+  answer a yes/no question, often named `is_*`, `has_*`, `can_*`)
+- Use `true`/`false` rather than `1`/`0` for boolean assignments
+- Boolean variables and parameters should not use explicit
+  comparison: `if (verbose)` is correct, not `if (verbose == true)`
+- `int` is still appropriate when a value can be negative, is an
+  error code, or carries more than two states
+
+**Structure fields:**
+- `bool` occupies 1 byte. In packed or cache-critical structures,
+  consider using a bitfield or flags word instead
+- For configuration structures and non-hot-path data, `bool` is
+  preferred over `int` for flag fields
+
+```c
+/* Bad - int flags waste space and obscure intent */
+struct port_config {
+	int promiscuous;     /* 0 or 1 */
+	int link_up;         /* 0 or 1 */
+	int autoneg;         /* 0 or 1 */
+	uint16_t mtu;
+};
+
+/* Good - bool for flag fields */
+struct port_config {
+	bool promiscuous;
+	bool link_up;
+	bool autoneg;
+	uint16_t mtu;
+};
+
+/* Also good - bitfield for cache-critical structures */
+struct fast_path_config {
+	uint32_t flags;      /* bitmask of CONFIG_F_* */
+	/* ... hot-path fields ... */
+};
+```
+
+**Do NOT flag:**
+- `int` return type for functions that return error codes (0 for
+  success, negative for error) — these are NOT boolean
+- `int` used for tri-state or multi-state values
+- `int` flags in existing code where changing the type would be a
+  large, unrelated refactor
+- Bitfield or flags-word approaches in performance-critical
+  structures
+
+### Indentation and Braces
+
+```c
+/* Control statements - no braces for single statements */
+if (val != NULL)
+	val = realloc(val, newsize);
+
+/* Braces on same line as else */
+if (test)
+	stmt;
+else if (bar) {
+	stmt;
+	stmt;
+} else
+	stmt;
+
+/* Switch statements - don't indent case */
+switch (ch) {
+case 'a':
+	aflag = 1;
+	/* FALLTHROUGH */
+case 'b':
+	bflag = 1;
+	break;
+default:
+	usage();
+}
+
+/* Long conditions - double indent continuation */
+if (really_long_variable_name_1 == really_long_variable_name_2 &&
+		really_long_variable_name_3 == really_long_variable_name_4)
+	stmt;
+```
+
+### Variable Declarations
+
+- Prefer declaring variables inside the basic block where they are used
+- Variables may be declared either at the start of the block, or at point of first use (C99 style)
+- Both declaration styles are acceptable; consistency within a function is preferred
+- Initialize variables only when a meaningful value exists at declaration time
+- Use C99 designated initializers for structures
+
+```c
+/* Good - declaration at start of block */
+int ret;
+ret = some_function();
+
+/* Also good - declaration at point of use (C99 style) */
+for (int i = 0; i < count; i++)
+	process(i);
+
+/* Good - declaration in inner block where variable is used */
+if (condition) {
+	int local_val = compute();
+	use(local_val);
+}
+
+/* Bad - unnecessary initialization defeats compiler warnings */
+int ret = 0;
+ret = some_function();    /* Compiler won't warn if assignment removed */
+```
+
+### Function Format
+
+- Return type on its own line
+- Opening brace on its own line
+- Place an empty line between declarations and statements
+
+```c
+static char *
+function(int a1, int b1)
+{
+	char *p;
+
+	p = do_something(a1, b1);
+	return p;
+}
+```
+
+---
+
+## Unnecessary Code Patterns
+
+The following patterns add unnecessary code, hide bugs, or reduce performance. Avoid them.
+
+### Unnecessary Variable Initialization
+
+Do not initialize variables that will be assigned before use. This defeats the compiler's uninitialized variable warnings, hiding potential bugs.
+
+```c
+/* Bad - initialization defeats -Wuninitialized */
+int ret = 0;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - compiler will warn if any path misses assignment */
+int ret;
+if (condition)
+	ret = func_a();
+else
+	ret = func_b();
+
+/* Good - meaningful initial value */
+int count = 0;
+for (i = 0; i < n; i++)
+	if (test(i))
+		count++;
+```
+
+### Unnecessary Casts of void *
+
+In C, `void *` converts implicitly to any pointer type. Casting the result of `malloc()`, `calloc()`, `rte_malloc()`, or similar functions is unnecessary and can hide the error of a missing `#include <stdlib.h>`.
+
+```c
+/* Bad - unnecessary cast */
+struct foo *p = (struct foo *)malloc(sizeof(*p));
+struct bar *q = (struct bar *)rte_malloc(NULL, sizeof(*q), 0);
+
+/* Good - no cast needed in C */
+struct foo *p = malloc(sizeof(*p));
+struct bar *q = rte_malloc(NULL, sizeof(*q), 0);
+```
+
+Note: Casts are required in C++ but DPDK is a C project.
+
+### Zero-Length Arrays vs Flexible Array Members
+
+Zero-length arrays (`int arr[0]`) are a GCC extension. Use C99 flexible array members instead.
+
+```c
+/* Bad - GCC extension */
+struct msg {
+	int len;
+	char data[0];
+};
+
+/* Good - C99 flexible array member */
+struct msg {
+	int len;
+	char data[];
+};
+```
+
+### Unnecessary NULL Checks Before free()
+
+Functions like `free()`, `rte_free()`, and similar deallocation functions accept NULL pointers safely. Do not add redundant NULL checks.
+
+```c
+/* Bad - unnecessary check */
+if (ptr != NULL)
+	free(ptr);
+
+if (rte_ptr != NULL)
+	rte_free(rte_ptr);
+
+/* Good - free handles NULL */
+free(ptr);
+rte_free(rte_ptr);
+```
+
+### memset Before free() (CWE-14)
+
+Do not call `memset()` to zero memory before freeing it. The compiler may optimize away the `memset()` as a dead store (CWE-14: Compiler Removal of Code to Clear Buffers). For security-sensitive data, use `explicit_bzero()`, `rte_memset_sensitive()`, or `rte_free_sensitive()` which the compiler is not permitted to eliminate.
+
+```c
+/* Bad - compiler may eliminate memset */
+memset(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for non-sensitive data, just free */
+free(ptr);
+
+/* Good - explicit_bzero cannot be optimized away */
+explicit_bzero(secret_key, sizeof(secret_key));
+free(secret_key);
+
+/* Good - DPDK wrapper for clearing sensitive data */
+rte_memset_sensitive(secret_key, 0, sizeof(secret_key));
+free(secret_key);
+
+/* Good - for rte_malloc'd sensitive data, combined clear+free */
+rte_free_sensitive(secret_key);
+```
+
+### Appropriate Use of rte_malloc()
+
+`rte_malloc()` allocates from hugepage memory. Use it only when required:
+
+- Memory that will be accessed by DMA (NIC descriptors, packet buffers)
+- Memory shared between primary and secondary DPDK processes
+- Memory requiring specific NUMA node placement
+
+For general allocations, use standard `malloc()`, which is faster and does not consume limited hugepage resources.
+
+```c
+/* Bad - rte_malloc for ordinary data structure */
+struct config *cfg = rte_malloc(NULL, sizeof(*cfg), 0);
+
+/* Good - standard malloc for control structures */
+struct config *cfg = malloc(sizeof(*cfg));
+
+/* Good - rte_malloc for DMA-accessible memory such as descriptor rings */
+struct mydrv_rx_desc *ring = rte_malloc(NULL, n * sizeof(*ring), RTE_CACHE_LINE_SIZE);
+```
+
+### Appropriate Use of rte_memcpy()
+
+`rte_memcpy()` is optimized for bulk data transfer in the fast path. For general use, standard `memcpy()` is preferred because:
+
+- Modern compilers optimize `memcpy()` effectively
+- `memcpy()` includes bounds checking with `_FORTIFY_SOURCE`
+- `memcpy()` handles small fixed-size copies efficiently
+
+```c
+/* Bad - rte_memcpy in control path */
+rte_memcpy(&config, &default_config, sizeof(config));
+
+/* Good - standard memcpy for control path */
+memcpy(&config, &default_config, sizeof(config));
+
+/* Good - rte_memcpy for packet data in fast path */
+rte_memcpy(rte_pktmbuf_mtod(m, void *), payload, len);
+```
+
+### Non-const Function Pointer Arrays
+
+Arrays of function pointers (ops tables, dispatch tables, callback arrays)
+should be declared `const` when their contents are fixed at compile time.
+A non-`const` function pointer array can be overwritten by bugs or exploits,
+and prevents the compiler from placing the table in read-only memory.
+
+```c
+/* Bad - mutable when it doesn't need to be */
+static rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+
+/* Good - immutable dispatch table */
+static const rte_rx_burst_t rx_functions[] = {
+	rx_burst_scalar,
+	rx_burst_vec_avx2,
+	rx_burst_vec_avx512,
+};
+```
+
+**Exceptions** (do NOT flag):
+- Arrays modified at runtime for CPU feature detection or capability probing
+  (e.g., selecting a burst function based on `rte_cpu_get_flag_enabled()`)
+- Arrays containing mutable state (e.g., entries that are linked into lists)
+- Arrays populated dynamically via registration APIs
+- `dev_ops` or similar structures assigned per-device at init time
+
+Only flag when the array is fully initialized at declaration with constant
+values and never modified thereafter.
+
+---
+
+## Forbidden Tokens
+
+### Functions
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `rte_panic()` | Return error codes | lib/, drivers/ |
+| `rte_exit()` | Return error codes | lib/, drivers/ |
+| `perror()` | `RTE_LOG()` with `strerror(errno)` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `printf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `fprintf()` | `RTE_LOG()` | lib/, drivers/ (allowed in examples/, app/test/) |
+| `getenv()` | `rte_kvargs_parse()` / devargs | drivers/ (allowed in EAL, examples/, app/test/) |
+
+### Atomics and Memory Barriers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `rte_atomic16/32/64_xxx()` | C11 atomics via `rte_atomic_xxx()` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence()` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence()` |
+| `__sync_xxx()` | `rte_atomic_xxx()` |
+| `__atomic_xxx()` | `rte_atomic_xxx()` |
+| `__ATOMIC_RELAXED` etc. | `rte_memory_order_xxx` |
+| `__rte_atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+
+#### Shared Variable Access: volatile vs Atomics
+
+Variables shared between threads or between a thread and a signal
+handler **must** use atomic operations. The C `volatile` keyword is
+NOT a substitute for atomics — it prevents compiler optimization
+of accesses but provides no atomicity guarantees and no memory
+ordering between threads. On some architectures, `volatile` reads
+and writes may tear on unaligned or multi-word values.
+
+DPDK provides C11 atomic wrappers that are portable across all
+supported compilers and architectures. Always use these for shared
+state.
+
+**Reading shared variables:**
+
+```c
+/* BAD - volatile provides no atomicity or ordering guarantee */
+volatile int stop_flag;
+if (stop_flag)           /* data race, compiler/CPU can reorder */
+    return;
+
+/* BAD - direct access to shared variable without atomic */
+if (shared->running)     /* undefined behavior if another thread writes */
+    process();
+
+/* GOOD - DPDK C11 atomic wrapper */
+if (rte_atomic_load_explicit(&shared->stop_flag, rte_memory_order_acquire))
+    return;
+
+/* GOOD - relaxed is fine for statistics or polling a flag where
+ * you don't need to synchronize other memory accesses */
+count = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+```
+
+**Writing shared variables:**
+
+```c
+/* BAD - volatile write */
+volatile int *flag = &shared->ready;
+*flag = 1;
+
+/* GOOD - atomic store with appropriate ordering */
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+```
+
+**Read-modify-write operations:**
+
+```c
+/* BAD - not atomic even with volatile */
+volatile uint64_t *counter = &stats->packets;
+*counter += nb_rx;       /* data race: load, add, store are 3 separate operations */
+
+/* GOOD - atomic add */
+rte_atomic_fetch_add_explicit(&stats->packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Forbidden Atomic APIs in New Code
+
+New code **must not** use GCC/Clang `__atomic_*` built-ins or the
+legacy DPDK `rte_smp_*mb()` barriers. These are deprecated and
+will be removed. Use the DPDK C11 atomic wrappers instead.
+
+**GCC/Clang `__atomic_*` built-ins — do not use:**
+
+```c
+/* BAD - GCC built-in, not portable, not DPDK API */
+val = __atomic_load_n(&shared->count, __ATOMIC_RELAXED);
+__atomic_store_n(&shared->flag, 1, __ATOMIC_RELEASE);
+__atomic_fetch_add(&shared->counter, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&shared->state, &expected, desired,
+    0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+
+/* GOOD - DPDK C11 atomic wrappers */
+val = rte_atomic_load_explicit(&shared->count, rte_memory_order_relaxed);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+rte_atomic_fetch_add_explicit(&shared->counter, 1, rte_memory_order_relaxed);
+rte_atomic_compare_exchange_strong_explicit(&shared->state, &expected, desired,
+    rte_memory_order_acq_rel, rte_memory_order_acquire);
+rte_atomic_thread_fence(rte_memory_order_seq_cst);
+```
+
+Similarly, do not use `__sync_*` built-ins (`__sync_fetch_and_add`,
+`__sync_bool_compare_and_swap`, etc.) — these are the older GCC
+atomics with implicit full barriers and are even less appropriate
+than `__atomic_*`.
+
+**Legacy DPDK barriers — do not use:**
+
+```c
+/* BAD - legacy DPDK barriers, deprecated */
+rte_smp_mb();            /* full memory barrier */
+rte_smp_rmb();           /* read memory barrier */
+rte_smp_wmb();           /* write memory barrier */
+
+/* GOOD - C11 fence with explicit ordering */
+rte_atomic_thread_fence(rte_memory_order_seq_cst);   /* replaces rte_smp_mb() */
+rte_atomic_thread_fence(rte_memory_order_acquire);    /* replaces rte_smp_rmb() */
+rte_atomic_thread_fence(rte_memory_order_release);    /* replaces rte_smp_wmb() */
+
+/* BETTER - use ordering on the atomic operation itself when possible */
+val = rte_atomic_load_explicit(&shared->flag, rte_memory_order_acquire);
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+The legacy `rte_atomic16/32/64_*()` type-specific functions (e.g.,
+`rte_atomic32_inc()`, `rte_atomic64_read()`) are also deprecated.
+Use `rte_atomic_fetch_add_explicit()`, `rte_atomic_load_explicit()`,
+etc. with standard C integer types.
+
+| Deprecated API | Replacement |
+|----------------|-------------|
+| `__atomic_load_n()` | `rte_atomic_load_explicit()` |
+| `__atomic_store_n()` | `rte_atomic_store_explicit()` |
+| `__atomic_fetch_add()` | `rte_atomic_fetch_add_explicit()` |
+| `__atomic_compare_exchange_n()` | `rte_atomic_compare_exchange_strong_explicit()` |
+| `__atomic_thread_fence()` | `rte_atomic_thread_fence()` |
+| `__ATOMIC_RELAXED` | `rte_memory_order_relaxed` |
+| `__ATOMIC_ACQUIRE` | `rte_memory_order_acquire` |
+| `__ATOMIC_RELEASE` | `rte_memory_order_release` |
+| `__ATOMIC_ACQ_REL` | `rte_memory_order_acq_rel` |
+| `__ATOMIC_SEQ_CST` | `rte_memory_order_seq_cst` |
+| `rte_smp_mb()` | `rte_atomic_thread_fence(rte_memory_order_seq_cst)` |
+| `rte_smp_rmb()` | `rte_atomic_thread_fence(rte_memory_order_acquire)` |
+| `rte_smp_wmb()` | `rte_atomic_thread_fence(rte_memory_order_release)` |
+| `rte_atomic32_inc(&v)` | `rte_atomic_fetch_add_explicit(&v, 1, rte_memory_order_relaxed)` |
+| `rte_atomic64_read(&v)` | `rte_atomic_load_explicit(&v, rte_memory_order_relaxed)` |
+
+#### Memory Ordering Guide
+
+Use the weakest ordering that is correct. Stronger ordering
+constrains hardware and compiler optimization unnecessarily.
+
+| DPDK Ordering | When to Use |
+|---------------|-------------|
+| `rte_memory_order_relaxed` | Statistics counters, polling flags where no other data depends on the value. Most common for simple counters. |
+| `rte_memory_order_acquire` | **Load** side of a flag/pointer that guards access to other shared data. Ensures subsequent reads see data published by the releasing thread. |
+| `rte_memory_order_release` | **Store** side of a flag/pointer that publishes shared data. Ensures all prior writes are visible to a thread that does an acquire load. |
+| `rte_memory_order_acq_rel` | Read-modify-write operations (e.g., `fetch_add`) that both consume and publish shared state in one operation. |
+| `rte_memory_order_seq_cst` | Rarely needed. Only when multiple independent atomic variables must be observed in a globally consistent total order. Avoid unless required. |
+
+**Common pattern — producer/consumer flag:**
+
+```c
+/* Producer thread: fill buffer, then signal ready */
+fill_buffer(buf, data, len);
+rte_atomic_store_explicit(&shared->ready, 1, rte_memory_order_release);
+
+/* Consumer thread: wait for flag, then read buffer */
+while (!rte_atomic_load_explicit(&shared->ready, rte_memory_order_acquire))
+    rte_pause();
+process_buffer(buf, len);  /* guaranteed to see producer's writes */
+```
+
+**Common pattern — statistics counter (no ordering needed):**
+
+```c
+rte_atomic_fetch_add_explicit(&port_stats->rx_packets, nb_rx,
+    rte_memory_order_relaxed);
+```
+
+#### Standalone Fences
+
+Prefer ordering on the atomic operation itself (acquire load,
+release store) over standalone fences. Standalone fences
+(`rte_atomic_thread_fence()`) are a blunt instrument that
+orders ALL memory accesses around the fence, not just the
+atomic variable you care about.
+
+```c
+/* Acceptable but less precise - standalone fence */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_relaxed);
+rte_atomic_thread_fence(rte_memory_order_release);
+
+/* Preferred - ordering on the operation itself */
+rte_atomic_store_explicit(&shared->flag, 1, rte_memory_order_release);
+```
+
+Standalone fences are appropriate when synchronizing multiple
+non-atomic writes (e.g., filling a structure before publishing
+a pointer to it) where annotating each write individually is
+impractical.
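+
+A sketch of that case (structure and field names illustrative):
+fill plain fields, then let one release fence order them before
+the relaxed store that publishes the index.
+
+```c
+struct job *j = &slot[idx];
+
+j->len = len;       /* plain, non-atomic writes */
+j->flags = flags;
+
+/* One fence orders both writes above before the publish */
+rte_atomic_thread_fence(rte_memory_order_release);
+rte_atomic_store_explicit(&ring->tail, idx, rte_memory_order_relaxed);
+```
+
+A consumer that acquire-loads `ring->tail` and reads the slot is
+then guaranteed to see the filled fields.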
+
+#### When volatile Is Still Acceptable
+
+`volatile` remains correct for:
+- Memory-mapped I/O registers (hardware MMIO)
+- Variables shared with signal handlers in single-threaded contexts
+- Interaction with `setjmp`/`longjmp`
+
+`volatile` is NOT correct for:
+- Any variable accessed by multiple threads
+- Polling flags between lcores
+- Statistics counters updated from multiple threads
+- Flags set by one thread and read by another
+
+**Do NOT flag** `volatile` used for MMIO or hardware register access
+(common in drivers under `drivers/*/base/`).
+
+### Threading
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `pthread_create()` | `rte_thread_create()` |
+| `pthread_join()` | `rte_thread_join()` |
+| `pthread_detach()` | EAL thread functions |
+| `pthread_setaffinity_np()` | `rte_thread_set_affinity()` |
+| `rte_thread_set_name()` | `rte_thread_set_prefixed_name()` |
+| `rte_thread_create_control()` | `rte_thread_create_internal_control()` |
+
+### Process-Shared Synchronization
+
+When placing synchronization primitives in shared memory (memory accessible by multiple processes, such as DPDK primary/secondary processes or `mmap`'d regions), they **must** be initialized with process-shared attributes. Failure to do so causes **undefined behavior** that may appear to work in testing but fail unpredictably in production.
+
+#### pthread Mutexes in Shared Memory
+
+**This is an error** - mutex in shared memory without `PTHREAD_PROCESS_SHARED`:
+
+```c
+/* BAD - undefined behavior when used across processes */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+void init_shared(struct shared_data *shm) {
+	pthread_mutex_init(&shm->lock, NULL);  /* ERROR: missing pshared attribute */
+}
+```
+
+**Correct implementation**:
+
+```c
+/* GOOD - properly initialized for cross-process use */
+struct shared_data {
+	pthread_mutex_t lock;
+	int counter;
+};
+
+int init_shared(struct shared_data *shm) {
+	pthread_mutexattr_t attr;
+	int ret;
+
+	ret = pthread_mutexattr_init(&attr);
+	if (ret != 0)
+		return -ret;
+
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret != 0) {
+		pthread_mutexattr_destroy(&attr);
+		return -ret;
+	}
+
+	ret = pthread_mutex_init(&shm->lock, &attr);
+	pthread_mutexattr_destroy(&attr);
+
+	return -ret;
+}
+```
+
+#### pthread Condition Variables in Shared Memory
+
+Condition variables also require the process-shared attribute:
+
+```c
+/* BAD - will not work correctly across processes */
+pthread_cond_init(&shm->cond, NULL);
+
+/* GOOD */
+pthread_condattr_t cattr;
+pthread_condattr_init(&cattr);
+pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+pthread_cond_init(&shm->cond, &cattr);
+pthread_condattr_destroy(&cattr);
+```
+
+#### pthread Read-Write Locks in Shared Memory
+
+```c
+/* BAD */
+pthread_rwlock_init(&shm->rwlock, NULL);
+
+/* GOOD */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init(&rwattr);
+pthread_rwlockattr_setpshared(&rwattr, PTHREAD_PROCESS_SHARED);
+pthread_rwlock_init(&shm->rwlock, &rwattr);
+pthread_rwlockattr_destroy(&rwattr);
+```
+
+#### When to Flag This Issue
+
+Flag as an **Error** when ALL of the following are true:
+1. A `pthread_mutex_t`, `pthread_cond_t`, `pthread_rwlock_t`, or `pthread_barrier_t` is initialized
+2. The primitive is stored in shared memory (identified by context such as: structure in `rte_malloc`/`rte_memzone`, `mmap`'d memory, memory passed to secondary processes, or structures documented as shared)
+3. The initialization uses `NULL` attributes or attributes without `PTHREAD_PROCESS_SHARED`
+
+**Do NOT flag** when:
+- The mutex is in thread-local or process-private heap memory (`malloc`)
+- The mutex is a local/static variable not in shared memory
+- The code already uses `pthread_mutexattr_setpshared()` with `PTHREAD_PROCESS_SHARED`
+- The synchronization uses DPDK primitives (`rte_spinlock_t`, `rte_rwlock_t`) which are designed for shared memory
+
+#### Preferred Alternatives
+
+For DPDK code, prefer DPDK's own synchronization primitives which are designed for shared memory:
+
+| pthread Primitive | DPDK Alternative |
+|-------------------|------------------|
+| `pthread_mutex_t` | `rte_spinlock_t` (busy-wait) or properly initialized pthread mutex |
+| `pthread_rwlock_t` | `rte_rwlock_t` |
+| `pthread_spinlock_t` | `rte_spinlock_t` |
+
+Note: `rte_spinlock_t` and `rte_rwlock_t` work correctly in shared memory without special initialization, but they are spinning locks unsuitable for long wait times.
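+
+A minimal sketch using the DPDK alternative (assuming a variant of
+`shared_data` whose `lock` member is an `rte_spinlock_t`, placed in an
+`rte_memzone` visible to secondary processes):
+
+```c
+#include <rte_spinlock.h>
+
+/* GOOD - no process-shared attribute needed */
+void init_shared_lock(struct shared_data *shm) {
+	rte_spinlock_init(&shm->lock);
+}
+```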
+
+### Compiler Built-ins and Attributes
+
+| Forbidden | Preferred | Notes |
+|-----------|-----------|-------|
+| `__attribute__` | RTE macros in `rte_common.h` | Except in `lib/eal/include/rte_common.h` |
+| `__alignof__` | C11 `alignof` | |
+| `__typeof__` | `typeof` | |
+| `__builtin_*` | EAL macros | Except in `lib/eal/` and `drivers/*/base/` |
+| `__reserved` | Different name | Reserved in Windows headers |
+| `#pragma` / `_Pragma` | Avoid | Except in `rte_common.h` |
+
+### Format Specifiers
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `%lld`, `%llu`, `%llx` | `PRId64`, `PRIu64`, `PRIx64` |
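+
+The `PRI*64` macros come from `<inttypes.h>` and expand to the length and
+conversion letters only, so the `%` stays in the string literal (sketch;
+`pkts` is an illustrative counter):
+
+```c
+#include <inttypes.h>
+
+uint64_t pkts = stats.rx_packets;
+
+RTE_LOG(INFO, EAL, "received %" PRIu64 " packets\n", pkts);
+```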
+
+### Headers and Build
+
+| Forbidden | Preferred | Context |
+|-----------|-----------|---------|
+| `#include <linux/pci_regs.h>` | `#include <rte_pci.h>` | |
+| `install_headers()` | Meson `headers` variable | meson.build |
+| `-DALLOW_EXPERIMENTAL_API` | Not in lib/drivers/app | Build flags |
+| `allow_experimental_apis` | Not in lib/drivers/app | Meson |
+| `#undef XXX` | `// XXX is not set` | config/rte_config.h |
+| Driver headers (`*_driver.h`, `*_pmd.h`) | Public API headers | app/, examples/ |
+
+### Testing
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `REGISTER_TEST_COMMAND` | `REGISTER_<suite_name>_TEST` |
+
+### Documentation
+
+| Forbidden | Preferred |
+|-----------|-----------|
+| `http://...dpdk.org` | `https://...dpdk.org` |
+| `//doc.dpdk.org/guides/...` | `:ref:` or `:doc:` Sphinx references |
+| `::  file.svg` | `::  file.*` (wildcard extension) |
+
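+For example, a hard-coded guide URL should become a Sphinx reference
+(illustrative targets):
+
+```rst
+Bad:  see https://doc.dpdk.org/guides/prog_guide/ring_lib.html
+Good: see the :doc:`Ring Library <../prog_guide/ring_lib>` guide
+```
+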
+---
+
+## Deprecated API Usage
+
+New patches must not introduce usage of deprecated APIs, macros, or functions.
+Deprecated items are marked with `RTE_DEPRECATED` or documented in the
+deprecation notices section of the release notes.
+
+### Rules for New Code
+
+- Do not call functions marked with `RTE_DEPRECATED` or `__rte_deprecated`
+- Do not use macros that have been superseded by newer alternatives
+- Do not use data structures or enum values marked as deprecated
+- Check `doc/guides/rel_notes/deprecation.rst` for planned deprecations
+- When a deprecated API has a replacement, use the replacement
+
+### Deprecating APIs
+
+A patch may mark an API as deprecated provided:
+
+- No remaining usages exist in the current DPDK codebase
+- The deprecation is documented in the release notes
+- A migration path or replacement API is documented
+- The `RTE_DEPRECATED` macro is used to generate compiler warnings
+
+```c
+/* Marking a function as deprecated */
+__rte_deprecated
+int
+rte_old_function(void);
+
+/* With a message pointing to the replacement */
+__rte_deprecated_msg("use rte_new_function() instead")
+int
+rte_old_function(void);
+```
+
+### Common Deprecated Patterns
+
+| Deprecated | Replacement | Notes |
+|-----------|-------------|-------|
+| `rte_atomic*_t` types | C11 atomics | Use `rte_atomic_xxx()` wrappers |
+| `rte_smp_*mb()` barriers | `rte_atomic_thread_fence()` | See Atomics section |
+| `pthread_*()` in portable code | `rte_thread_*()` | See Threading section |
+
+When reviewing patches that add new code, flag any usage of deprecated APIs
+as requiring change to use the modern replacement.
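+
+For example, migrating a deprecated atomic counter (sketch; `refcnt` is
+illustrative):
+
+```c
+/* Deprecated */
+rte_atomic32_t refcnt;
+rte_atomic32_inc(&refcnt);
+
+/* Preferred - C11-style wrappers from rte_stdatomic.h */
+RTE_ATOMIC(uint32_t) refcnt;
+rte_atomic_fetch_add_explicit(&refcnt, 1, rte_memory_order_relaxed);
+```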
+
+---
+
+## API Tag Requirements
+
+### `__rte_experimental`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_experimental
+int
+rte_new_feature(void);
+
+/* Wrong - not alone on line */
+__rte_experimental int rte_new_feature(void);
+
+/* Wrong - tag used in a .c file; it belongs on the declaration in the header */
+```
+
+### `__rte_internal`
+
+- Must appear **alone on the line** immediately preceding the return type
+- Only allowed in **header files** (not `.c` files)
+
+```c
+/* Correct */
+__rte_internal
+int
+internal_function(void);
+```
+
+### Alignment Attributes
+
+`__rte_aligned`, `__rte_cache_aligned`, `__rte_cache_min_aligned` may only be used with `struct` or `union` types:
+
+```c
+/* Correct */
+struct __rte_cache_aligned my_struct {
+	/* ... */
+};
+
+/* Wrong */
+int __rte_cache_aligned my_variable;
+```
+
+### Packed Attributes
+
+- `__rte_packed_begin` must follow `struct`, `union`, or alignment attributes
+- `__rte_packed_begin` and `__rte_packed_end` must be used in pairs
+- Cannot use `__rte_packed_begin` with `enum`
+
+```c
+/* Correct */
+struct __rte_packed_begin my_packed_struct {
+	/* ... */
+} __rte_packed_end;
+
+/* Wrong - with enum */
+enum __rte_packed_begin my_enum {
+	/* ... */
+};
+```
+
+---
+
+## Code Quality Requirements
+
+### Compilation
+
+- Each commit must compile independently (for `git bisect`)
+- No forward dependencies within a patchset
+- Test with multiple targets, compilers, and options
+- Use `devtools/test-meson-builds.sh`
+
+**Note for AI reviewers**: You cannot verify compilation order or cross-patch dependencies from patch review alone. Do NOT flag patches claiming they "would fail to compile" based on symbols used in other patches in the series. Assume the patch author has ordered them correctly.
+
+### Testing
+
+- Add tests to `app/test` unit test framework
+- New API functions must be used in `/app` test directory
+- New device APIs require at least one driver implementation
+
+#### Functional Test Infrastructure
+
+Standalone functional tests should use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure for consistency and proper integration with the DPDK test framework.
+
+```c
+#include <rte_test.h>
+
+static int
+test_feature_basic(void)
+{
+	int ret;
+
+	ret = rte_feature_init();
+	TEST_ASSERT_SUCCESS(ret, "Failed to initialize feature");
+
+	ret = rte_feature_operation();
+	TEST_ASSERT_EQUAL(ret, 0, "Operation returned unexpected value");
+
+	TEST_ASSERT_NOT_NULL(rte_feature_get_ptr(),
+		"Feature pointer should not be NULL");
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite feature_testsuite = {
+	.suite_name = "feature_autotest",
+	.setup = test_feature_setup,
+	.teardown = test_feature_teardown,
+	.unit_test_cases = {
+		TEST_CASE(test_feature_basic),
+		TEST_CASE(test_feature_advanced),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_feature(void)
+{
+	return unit_test_suite_runner(&feature_testsuite);
+}
+
+REGISTER_FAST_TEST(feature_autotest, NOHUGE_OK, ASAN_OK, test_feature);
+```
+
+The `REGISTER_FAST_TEST` macro parameters are:
+- Test name (e.g., `feature_autotest`)
+- `NOHUGE_OK` or `HUGEPAGES_REQUIRED` - whether test can run without hugepages
+- `ASAN_OK` or `ASAN_FAILS` - whether test is compatible with Address Sanitizer
+- Test function name
+
+Common `TEST_ASSERT` macros:
+- `TEST_ASSERT(cond, msg, ...)` - Assert condition is true
+- `TEST_ASSERT_SUCCESS(val, msg, ...)` - Assert value equals 0
+- `TEST_ASSERT_FAIL(val, msg, ...)` - Assert value is non-zero
+- `TEST_ASSERT_EQUAL(a, b, msg, ...)` - Assert two values are equal
+- `TEST_ASSERT_NOT_EQUAL(a, b, msg, ...)` - Assert two values differ
+- `TEST_ASSERT_NULL(val, msg, ...)` - Assert value is NULL
+- `TEST_ASSERT_NOT_NULL(val, msg, ...)` - Assert value is not NULL
+
+### Documentation
+
+- Add Doxygen comments for public APIs
+- Update release notes in `doc/guides/rel_notes/` for important changes
+- Code and documentation must be updated atomically in same patch
+- Only update the **current release** notes file
+- Documentation must match the code
+- PMD features must match the features matrix in `doc/guides/nics/features/`
+- Documentation must match device operations (see `doc/guides/nics/features.rst` for the mapping between features, `eth_dev_ops`, and related APIs)
+- Release notes are NOT required for:
+  - Test-only changes (unit tests, functional tests)
+  - Internal APIs and helper functions (not exported to applications)
+  - Internal implementation changes that don't affect public API
+
+### RST Documentation Style
+
+When reviewing `.rst` documentation files, prefer **definition lists**
+over simple bullet lists where each item has a term and a description.
+Definition lists produce better-structured HTML/PDF output and are
+easier to scan.
+
+**When to suggest a definition list:**
+- A bullet list where each item starts with a bold or emphasized term
+  followed by a dash, colon, or long explanation
+- Lists of options, parameters, configuration values, or features
+  where each entry has a name and a description
+- Glossary-style enumerations
+
+**When a simple list is fine (do NOT flag):**
+- Short lists of items without descriptions (e.g., file names, steps)
+- Lists where items are single phrases or sentences with no term/definition structure
+- Enumerated steps in a procedure
+
+**RST definition list syntax:**
+
+```rst
+term 1
+   Description of term 1.
+
+term 2
+   Description of term 2.
+   Can span multiple lines.
+```
+
+**Example — flag this pattern:**
+
+```rst
+* **error** - Fail with error (default)
+* **truncate** - Truncate content to fit token limit
+* **summary** - Request high-level summary review
+```
+
+**Suggest rewriting as:**
+
+```rst
+error
+   Fail with error (default).
+
+truncate
+   Truncate content to fit token limit.
+
+summary
+   Request high-level summary review.
+```
+
+This is a **Warning**-level suggestion, not an Error. Do not flag it
+when the existing list structure is appropriate (see "when a simple
+list is fine" above).
+
+### API and Driver Changes
+
+- New APIs must be marked as `__rte_experimental`
+- New APIs must have hooks in `app/testpmd` and tests in the functional test suite
+- Changes to existing APIs require release notes
+- New drivers or subsystems must have release notes
+- Internal APIs (used only within DPDK, not exported to applications) do NOT require release notes
+
+### ABI Compatibility and Symbol Exports
+
+**IMPORTANT**: DPDK uses automatic symbol map generation. Do **NOT** recommend
+manually editing `version.map` files - they are auto-generated from source code
+annotations.
+
+#### Symbol Export Macros
+
+New public functions must be annotated with export macros (defined in
+`rte_export.h`). Place the macro on the line immediately before the function
+definition in the `.c` file:
+
+```c
+/* For stable ABI symbols */
+RTE_EXPORT_SYMBOL(rte_foo_create)
+int
+rte_foo_create(struct rte_foo_config *config)
+{
+    /* ... */
+}
+
+/* For experimental symbols (include version when first added) */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_new_feature, 25.03)
+__rte_experimental
+int
+rte_foo_new_feature(void)
+{
+    /* ... */
+}
+
+/* For internal symbols (shared between DPDK components only) */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_foo_internal_helper)
+int
+rte_foo_internal_helper(void)
+{
+    /* ... */
+}
+```
+
+#### Symbol Export Rules
+
+- `RTE_EXPORT_SYMBOL` - Use for stable ABI functions
+- `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, ver)` - Use for new experimental APIs
+  (version is the DPDK release, e.g., `25.03`)
+- `RTE_EXPORT_INTERNAL_SYMBOL` - Use for functions shared between DPDK libs/drivers
+  but not part of public API
+- Export macros go in `.c` files, not headers
+- The build system generates linker version maps automatically
+
+#### What NOT to Review
+
+- Do **NOT** flag missing `version.map` updates - maps are auto-generated
+- Do **NOT** suggest adding symbols to `lib/*/version.map` files
+
+#### ABI Versioning for Changed Functions
+
+When changing the signature of an existing stable function, use versioning macros
+from `rte_function_versioning.h`:
+
+- `RTE_VERSION_SYMBOL` - Create versioned symbol for backward compatibility
+- `RTE_DEFAULT_SYMBOL` - Mark the new default version
+
+Follow ABI policy and versioning guidelines in the contributor documentation.
+Enable ABI checks with `DPDK_ABI_REF_VERSION` environment variable.
+
+---
+
+## LTS (Long Term Stable) Release Review
+
+LTS releases are DPDK versions ending in `.11` (e.g., 23.11, 22.11,
+21.11, 20.11, 19.11). When reviewing patches targeting an LTS branch,
+apply stricter criteria:
+
+### LTS-Specific Rules
+
+- **Only bug fixes allowed** -- no new features
+- **No new APIs** (experimental or stable)
+- **ABI must remain unchanged** -- no symbol additions, removals,
+  or signature changes
+- Backported fixes should reference the original commit with a
+  `Fixes:` tag
+- Copyright years should reflect when the code was originally
+  written
+- Be conservative: reject changes that are not clearly bug fixes
+
+### What to Flag on LTS Branches
+
+**Error:**
+- New feature code (new functions, new driver capabilities)
+- New experimental or stable API additions
+- ABI changes (new or removed symbols, changed function signatures)
+- Changes that add new configuration options or parameters
+
+**Warning:**
+- Large refactoring that goes beyond what is needed for a fix
+- Missing `Fixes:` tag on a backported bug fix
+- Missing `Cc: stable@dpdk.org`
+
+### When LTS Rules Apply
+
+LTS rules apply when the reviewer is told the target release is an
+LTS version (via the `--release` option or equivalent). If no
+release is specified, assume the patch targets the main development
+branch where new features and APIs are allowed.
+
+---
+
+## Patch Validation Checklist
+
+### Commit Message and License
+
+Checked by `devtools/checkpatches.sh` -- not duplicated here.
+
+### Code Style
+
+- [ ] Lines <=100 characters
+- [ ] Hard tabs for indentation, spaces for alignment
+- [ ] No trailing whitespace
+- [ ] Proper include order
+- [ ] Header guards present
+- [ ] `rte_`/`RTE_` prefix on external symbols
+- [ ] Driver/library global variables use unique prefixes (e.g., `virtio_`, `mlx5_`)
+- [ ] No prohibited terminology
+- [ ] Proper brace style
+- [ ] Function return type on own line
+- [ ] Explicit comparisons: `== NULL`, `== 0`, `!= NULL`, `!= 0`
+- [ ] No forbidden tokens (see table above)
+- [ ] No unnecessary code patterns (see section above)
+- [ ] No usage of deprecated APIs, macros, or functions
+- [ ] Process-shared primitives in shared memory use `PTHREAD_PROCESS_SHARED`
+- [ ] `mmap()` return checked against `MAP_FAILED`, not `NULL`
+- [ ] Statistics use `+=` not `=` for accumulation
+- [ ] Integer multiplies widened before operation when result is 64-bit
+- [ ] Descriptor chain traversals bounded by ring size or loop counter
+- [ ] 64-bit bitmasks use `1ULL <<` or `RTE_BIT64()`, not `1 <<`
+- [ ] Left shifts of `uint8_t`/`uint16_t` cast to unsigned target width before shift when result is 64-bit
+- [ ] No unconditional variable overwrites before read
+- [ ] Nested loops use distinct counter variables
+- [ ] No `memcpy`/`memcmp` with identical source and destination pointers
+- [ ] `rte_mbuf_raw_free_bulk()` not used on mixed-pool mbuf arrays (Tx paths, ring dequeue, error paths)
+- [ ] MTU not confused with frame length (MTU = L3 payload, frame = MTU + L2 overhead)
+- [ ] PMDs read `dev->data->mtu` after configure, not `dev_conf.rxmode.mtu`
+- [ ] Ethernet overhead not hardcoded -- derived from device capabilities
+- [ ] Scatter Rx enabled or error returned when frame length exceeds single mbuf data size
+- [ ] `mtu_set` allows large MTU when scatter Rx is active; re-selects Rx burst function
+- [ ] Rx queue setup selects scattered Rx function when frame length exceeds mbuf
+- [ ] Static function pointer arrays declared `const` when contents are compile-time fixed
+- [ ] `bool` used for pure true/false variables, parameters, and predicate return types
+- [ ] Shared variables use `rte_atomic_*_explicit()`, not `volatile` or bare access
+- [ ] No `__atomic_*()` GCC built-ins or `__ATOMIC_*` ordering constants (use `rte_atomic_*_explicit()` and `rte_memory_order_*`)
+- [ ] No `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` (use `rte_atomic_thread_fence()`)
+- [ ] Memory ordering is the weakest correct choice (`relaxed` for counters, `acquire`/`release` for publish/consume)
+- [ ] Sensitive data cleared with `explicit_bzero()`/`rte_free_sensitive()`, not `memset()`
+
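+The shift and widening items above can be illustrated in a few lines
+(sketch; `buf` and `n` are illustrative):
+
+```c
+uint8_t b = buf[0];
+
+uint64_t mask = 1 << n;			/* BAD - 32-bit shift, UB for n >= 32 */
+uint64_t good = RTE_BIT64(n);		/* GOOD - or 1ULL << n */
+
+uint64_t v = b << 24;			/* BAD - b promoted to int, may sign-extend */
+uint64_t w = (uint64_t)b << 24;		/* GOOD - widened before the shift */
+```
+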
+### API Tags
+
+- [ ] `__rte_experimental` alone on line, only in headers
+- [ ] `__rte_internal` alone on line, only in headers
+- [ ] Alignment attributes only on struct/union
+- [ ] Packed attributes properly paired
+- [ ] New public functions have `RTE_EXPORT_*` macro in `.c` file
+- [ ] Experimental functions use `RTE_EXPORT_EXPERIMENTAL_SYMBOL(name, version)`
+
+### Structure
+
+- [ ] Each commit compiles independently
+- [ ] Code and docs updated together
+- [ ] Documentation matches code behavior
+- [ ] RST docs use definition lists for term/description patterns
+- [ ] PMD features match `doc/guides/nics/features/` matrix
+- [ ] Device operations match documentation (per `features.rst` mappings)
+- [ ] Tests added/updated as needed
+- [ ] Functional tests use TEST_ASSERT macros and unit_test_suite_runner
+- [ ] New APIs marked as `__rte_experimental`
+- [ ] New APIs have testpmd hooks and functional tests
+- [ ] Current release notes updated for significant changes
+- [ ] Release notes updated for API changes
+- [ ] Release notes updated for new drivers or subsystems
+
+---
+
+## Meson Build Files
+
+### Style Requirements
+
+- 4-space indentation (no tabs)
+- Line continuations double-indented
+- Lists alphabetically ordered
+- Short lists (<=3 items): single line, no trailing comma
+- Long lists: one item per line, trailing comma on last item
+- No hard line length limit for meson files; lines up to 100 characters are acceptable
+
+```meson
+# Short list
+sources = files('file1.c', 'file2.c')
+
+# Long list
+headers = files(
+        'header1.h',
+        'header2.h',
+        'header3.h',
+)
+```
+
+---
+
+## Python Code
+
+- Must comply with formatting standards
+- Use **`black`** for code formatting validation
+- Line length acceptable up to 100 characters
+
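+A typical check (sketch) before submitting:
+
+```bash
+black --check --line-length 100 devtools/analyze-patch.py
+```
+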
+---
+
+## Validation Tools
+
+Run these before submitting:
+
+```bash
+# Check commit messages
+devtools/check-git-log.sh -n1
+
+# Check patch format and forbidden tokens
+devtools/checkpatches.sh -n1
+
+# Check maintainers coverage
+devtools/check-maintainers.sh
+
+# Build validation
+devtools/test-meson-builds.sh
+
+# Find maintainers for your patch
+devtools/get-maintainer.sh <patch-file>
+```
+
+---
+
+## Severity Levels for AI Review
+
+**Error** (must fix):
+
+*Correctness bugs (highest value findings):*
+- Use-after-free
+- Resource leaks on error paths (memory, file descriptors, locks)
+- Double-free or double-close
+- NULL pointer dereference on reachable code path
+- Buffer overflow or out-of-bounds access
+- Missing error check on a function that can fail, leading to undefined behavior
+- Race condition on shared mutable state without synchronization
+- `volatile` used instead of atomics for inter-thread shared variables
+- `__atomic_*()` GCC built-ins in new code (must use `rte_atomic_*_explicit()`)
+- `rte_smp_mb()`/`rte_smp_rmb()`/`rte_smp_wmb()` in new code (must use `rte_atomic_thread_fence()`)
+- Error path that skips necessary cleanup
+- `mmap()` return value checked against NULL instead of `MAP_FAILED`
+- Statistics accumulation using `=` instead of `+=` (overwrite vs increment)
+- Integer multiply without widening cast losing upper bits (16×16, 32×32, etc.)
+- Unbounded descriptor chain traversal on guest/API-supplied indices
+- `1 << n` used for 64-bit bitmask (undefined behavior if n >= 32)
+- Left shift of `uint8_t`/`uint16_t` used in 64-bit context without widening cast (sign extension)
+- Variable assigned then unconditionally overwritten before read
+- Same variable used as counter in nested loops
+- `memcpy`/`memcmp` with same pointer as both arguments (UB or no-op logic error)
+- `rte_mbuf_raw_free_bulk()` on mbuf array where mbufs may come from different pools (Tx burst, ring dequeue)
+- MTU used where frame length is needed or vice versa (off by L2 overhead)
+- `dev_conf.rxmode.mtu` read after configure instead of `dev->data->mtu` (stale value)
+- MTU accepted without scatter Rx when frame size exceeds single mbuf capacity (silent truncation/drop)
+- `mtu_set` rejects valid MTU when scatter Rx is already enabled
+- Rx function selection ignores `scattered_rx` flag or MTU-vs-mbuf-size comparison
+
+*Process and format errors:*
+- Forbidden tokens in code
+- `__rte_experimental`/`__rte_internal` in .c files or not alone on line
+- Compilation failures
+- ABI breaks without proper versioning
+- pthread mutex/cond/rwlock in shared memory without `PTHREAD_PROCESS_SHARED`
+
+*API design errors (new libraries only):*
+- Ops/callback struct with 20+ function pointers in an installed header
+- Callback struct members with no Doxygen documentation
+- Void-returning callbacks for failable operations (errors silently swallowed)
+
+**Warning** (should fix):
+- Missing Cc: stable@dpdk.org for fixes
+- Documentation gaps
+- Documentation does not match code behavior
+- PMD features missing from `doc/guides/nics/features/` matrix
+- Device operations not documented per `features.rst` mappings
+- Missing tests
+- Functional tests not using TEST_ASSERT macros or unit_test_suite_runner
+- New API not marked as `__rte_experimental`
+- New API without testpmd hooks or functional tests
+- New public function missing `RTE_EXPORT_*` macro
+- API changes without release notes
+- New drivers or subsystems without release notes
+- Implicit comparisons (`!ptr` instead of `ptr == NULL`)
+- Unnecessary variable initialization
+- Unnecessary casts of `void *`
+- Unnecessary NULL checks before free
+- Inappropriate use of `rte_malloc()` or `rte_memcpy()`
+- Use of `perror()`, `printf()`, `fprintf()` in libraries or drivers (allowed in examples and test code)
+- Driver/library global variables without unique prefixes (static linking clash risk)
+- Usage of deprecated APIs, macros, or functions in new code
+- RST documentation using bullet lists where definition lists would be more appropriate
+- Ops/callback struct with >5 function pointers in an installed header (ABI risk)
+- New API using fixed enum+union where TLV pattern would be more extensible
+- Installed header labeled "private" or "internal" in meson.build
+- New library using global singleton instead of handle-based API
+- Static function pointer array not declared `const` when contents are compile-time constant
+- `int` used instead of `bool` for variables or return values that are purely true/false
+- `rte_memory_order_seq_cst` used where weaker ordering (`relaxed`, `acquire`/`release`) suffices
+- Standalone `rte_atomic_thread_fence()` where ordering on the atomic operation itself would be clearer
+- `getenv()` used in a driver or library for runtime configuration instead of devargs
+- Hardcoded Ethernet overhead constant instead of per-device overhead calculation
+- PMD does not advertise `RTE_ETH_RX_OFFLOAD_SCATTER` in `rx_offload_capa` but hardware supports multi-segment Rx
+- PMD `dev_info` reports `max_rx_pktlen` or `max_mtu` inconsistent with each other or with the Ethernet overhead
+- `mtu_set` callback does not re-select the Rx burst function after changing MTU
+
+**Do NOT flag** (common false positives):
+- Missing `version.map` updates (maps are auto-generated from `RTE_EXPORT_*` macros)
+- Suggesting manual edits to any `version.map` file
+- SPDX/copyright format, copyright years, copyright holders (not subject to AI review)
+- Commit message formatting (subject length, punctuation, tag order, case-sensitive terms) -- checked by checkpatch
+- Meson file lines under 100 characters
+- Comparisons using `== 0`, `!= 0`, `== NULL`, `!= NULL` as "implicit" (these ARE explicit)
+- Comparisons wrapped in `likely()` or `unlikely()` macros - these are still explicit if using == or !=
+- Anything you determine is correct (do not mention non-issues or say "No issue here")
+- `REGISTER_FAST_TEST` using `NOHUGE_OK`/`ASAN_OK` macros (this is the correct current format)
+- Missing release notes for test-only changes (unit tests do not require release notes)
+- Missing release notes for internal APIs or helper functions (only public APIs need release notes)
+- Any item you later correct with "(Correction: ...)" or "actually acceptable" - just omit it
+- Vague concerns ("should be verified", "should be checked") - if you're not sure it's wrong, don't flag it
+- Items where you say "which is correct" or "this is correct" - if it's correct, don't mention it at all
+- Items where you conclude "no issue here" or "this is actually correct" - omit these entirely
+- Clean patches in a series - do not include a patch just to say "no issues" or describe what it does
+- Cross-patch compilation dependencies - you cannot determine patch ordering correctness from review
+- Claims that a symbol "was removed in patch N" causing issues in patch M - assume author ordered correctly
+- Any speculation about whether patches will compile when applied in sequence
+- Mutexes/locks in process-private memory (standard `malloc`, stack, static non-shared) - these don't need `PTHREAD_PROCESS_SHARED`
+- Use of `rte_spinlock_t` or `rte_rwlock_t` in shared memory (these work correctly without special init)
+- `volatile` used for MMIO/hardware register access in drivers (this is correct usage)
+- Left shift of `uint8_t`/`uint16_t` where the result is stored in a `uint32_t` or narrower variable and not used in pointer arithmetic or 64-bit context (sign extension cannot occur)
+- `getenv()` used in EAL, examples, app/test, or build/config scripts (only flag in drivers/ and lib/)
+- Reading `rxmode.mtu` inside `rte_eth_dev_configure()` implementation (that is where the user request is consumed)
+- `=` assignment to MTU or frame length fields during initial setup (only flag stale reads of `rxmode.mtu` outside configure)
+- PMDs that auto-enable scatter when MTU exceeds mbuf size (this is the correct pattern)
+- Hardcoded `RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN` as overhead when the PMD does not support VLAN and device info is consistent
+- Tagged frames exceeding 1518 bytes at standard MTU -- a single-tagged frame of 1522 bytes is valid at MTU 1500 (the outer VLAN header is L2 overhead, not payload). Note: inner VLAN tags in QinQ *do* consume MTU; see the MTU section for details.
+
+**Info** (consider):
+- Minor style preferences
+- Optimization suggestions
+- Alternative approaches
+
+---
+
+## Response Format
+
+When you identify an issue:
+1. **State the problem** (1 sentence)
+2. **Why it matters** (1 sentence, only if not obvious)
+3. **Suggested fix** (code snippet or specific action)
+
+Example (illustrative function and variable names):
+
+1. Problem: `parse_name()` dereferences `str` before the NULL check.
+2. Why: a crash is reachable when the option is given with no value.
+3. Fix: move `if (str == NULL) return -EINVAL;` before the first use.
+
+---
+
+## FINAL CHECK BEFORE SUBMITTING REVIEW
+
+Before outputting your review, do two separate passes:
+
+### Pass 1: Verify correctness bugs are included
+
+Ask: "Did I trace every error path for resource leaks? Did I check
+for use-after-free? Did I verify error codes are propagated?"
+
+If you identified a potential correctness bug but talked yourself
+out of it, **add it back**. It is better to report a possible bug
+than to miss a real one.
+
+### Pass 2: Remove style/process false positives
+
+For EACH style/process item, ask: "Did I conclude this is actually
+fine/correct/acceptable/no issue?"
+
+If YES, DELETE THAT ITEM. It should not be in your output.
+
+An item that says "X is wrong... actually this is correct" is a
+FALSE POSITIVE and must be removed. This applies to style, format,
+and process items only.
+
+**If your Errors section would be empty after this check, that's
+fine -- it means the patches are good.**
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v13 2/6] devtools: add multi-provider AI patch review script
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
@ 2026-04-02 19:44     ` Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

This is an AI-generated script to review DPDK patches against
the AGENTS.md coding guidelines using AI language models.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

The script reads a patch file and the AGENTS.md guidelines, then
submits them to the selected AI provider for review. Results are
organized by severity level (Error, Warning, Info) as defined in
the guidelines.

Features:
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Verbose mode shows token usage statistics
  - Uses temporary files for API requests to handle large patches
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/analyze-patch.py 0001-net-ixgbe-fix-something.patch
  ./devtools/analyze-patch.py -p xai my-patch.patch
  ./devtools/analyze-patch.py -l  # list providers

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/analyze-patch.py | 1603 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1603 insertions(+)
 create mode 100755 devtools/analyze-patch.py

diff --git a/devtools/analyze-patch.py b/devtools/analyze-patch.py
new file mode 100755
index 0000000000..83c05689af
--- /dev/null
+++ b/devtools/analyze-patch.py
@@ -0,0 +1,1603 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Analyze DPDK patches using AI providers.
+
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+import time
+from dataclasses import dataclass, field
+from datetime import date
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, Iterator, NoReturn
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Output formats
+OUTPUT_FORMATS = ["text", "markdown", "html", "json"]
+
+# Large file handling modes
+LARGE_FILE_MODES = ["error", "truncate", "chunk", "commits-only", "summary"]
+
+# Approximate characters per token (conservative: fewer chars = higher estimate)
+CHARS_PER_TOKEN = 3.5
+
+# Default token limits by provider (leaving room for system prompt and response)
+PROVIDER_INPUT_LIMITS = {
+    "anthropic": 180000,  # 200K context, reserve for system/response
+    "openai": 900000,  # GPT-4.1 has 1M context
+    "xai": 1800000,  # Grok 4.1 Fast has 2M context
+    "google": 900000,  # Gemini 3 Flash has 1M context
+}
+
+
+@dataclass
+class TokenUsage:
+    """Accumulated token usage across API calls."""
+
+    input_tokens: int = 0
+    output_tokens: int = 0
+    cache_creation_tokens: int = 0
+    cache_read_tokens: int = 0
+    api_calls: int = 0
+
+    def add(self, other: "TokenUsage") -> None:
+        """Accumulate usage from another TokenUsage."""
+        self.input_tokens += other.input_tokens
+        self.output_tokens += other.output_tokens
+        self.cache_creation_tokens += other.cache_creation_tokens
+        self.cache_read_tokens += other.cache_read_tokens
+        self.api_calls += other.api_calls
+
+
+# Pricing per million tokens (USD) - update as prices change.
+# Outer key is the provider; inner keys are model-name prefixes.
+# The "default" key is the fallback for unknown models within a provider.
+PRICING: dict[str, dict[str, dict[str, float]]] = {
+    "anthropic": {
+        "claude-opus-4": {
+            "input": 15.0, "output": 75.0,
+            "cache_write": 18.75, "cache_read": 1.50,
+        },
+        "claude-sonnet-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+        "claude-haiku-4": {
+            "input": 0.80, "output": 4.0,
+            "cache_write": 1.0, "cache_read": 0.08,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+    },
+    "openai": {
+        "gpt-4.1": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+        "gpt-4.1-mini": {
+            "input": 0.40, "output": 1.60,
+            "cache_write": 0.40, "cache_read": 0.10,
+        },
+        "gpt-4.1-nano": {
+            "input": 0.10, "output": 0.40,
+            "cache_write": 0.10, "cache_read": 0.025,
+        },
+        "default": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+    },
+    "xai": {
+        "grok-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+    },
+    "google": {
+        "gemini-3-flash": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+        "default": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+    },
+}
+
+
+def get_pricing(provider: str, model: str) -> dict[str, float]:
+    """Look up per-million-token pricing for a provider/model."""
+    provider_prices = PRICING.get(provider, {})
+    # Prefer the longest matching prefix so e.g. "gpt-4.1-mini" is not
+    # shadowed by the shorter "gpt-4.1" entry.
+    best: dict[str, float] | None = None
+    best_len = -1
+    for prefix, prices in provider_prices.items():
+        if prefix != "default" and model.startswith(prefix):
+            if len(prefix) > best_len:
+                best, best_len = prices, len(prefix)
+    if best is not None:
+        return best
+    return provider_prices.get(
+        "default", {"input": 0, "output": 0, "cache_write": 0, "cache_read": 0}
+    )
+
+
+def estimate_cost(usage: TokenUsage, provider: str, model: str) -> float:
+    """Estimate cost in USD from token usage."""
+    prices = get_pricing(provider, model)
+    cost = 0.0
+    # Regular input tokens: total reported minus any cache-read tokens.
+    # Cache-creation tokens are reported separately and billed at cache_write rate.
+    # Clamp at zero: some providers exclude cached tokens from the input count.
+    regular_input = max(0, usage.input_tokens - usage.cache_read_tokens)
+    cost += regular_input * prices.get("input", 0) / 1_000_000
+    cost += usage.output_tokens * prices.get("output", 0) / 1_000_000
+    cost += usage.cache_creation_tokens * prices.get("cache_write", 0) / 1_000_000
+    cost += usage.cache_read_tokens * prices.get("cache_read", 0) / 1_000_000
+    return cost
+
+
+def format_token_summary(
+    usage: TokenUsage, provider: str, model: str, show_costs: bool
+) -> str:
+    """Format a token usage summary string."""
+    lines = ["=== Token Usage Summary ==="]
+    lines.append(f"API calls:     {usage.api_calls}")
+    lines.append(f"Input tokens:  {usage.input_tokens:,}")
+    lines.append(f"Output tokens: {usage.output_tokens:,}")
+    if usage.cache_creation_tokens:
+        lines.append(f"Cache write:   {usage.cache_creation_tokens:,}")
+    if usage.cache_read_tokens:
+        lines.append(f"Cache read:    {usage.cache_read_tokens:,}")
+    total = usage.input_tokens + usage.output_tokens
+    lines.append(f"Total tokens:  {total:,}")
+    if show_costs:
+        cost = estimate_cost(usage, provider, model)
+        lines.append(f"Est. cost:     ${cost:.4f}")
+    lines.append("=" * 27)
+    return "\n".join(lines)
+
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+# LTS releases: any DPDK release with minor version .11
+# (e.g., 19.11, 20.11, 21.11, 22.11, 23.11, 24.11, 25.11, ...)
+
+SYSTEM_PROMPT_BASE = """\
+You are an expert DPDK code reviewer. Analyze patches for compliance with \
+DPDK coding standards and contribution guidelines. Provide clear, actionable \
+feedback organized by severity (Error, Warning, Info) as defined in the \
+guidelines."""
+
+LTS_RULES = """
+LTS (Long Term Stable) branch rules apply:
+- Only bug fixes allowed, no new features
+- No new APIs (experimental or stable)
+- ABI must remain unchanged
+- Backported fixes should reference the original commit with Fixes: tag
+- Copyright years should reflect when the code was originally written
+- Be conservative: reject changes that aren't clearly bug fixes"""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """Provide your review in plain text format.""",
+    "markdown": """Provide your review in Markdown format with:
+- Headers (##) for each severity level (Errors, Warnings, Info)
+- Bullet points for individual issues
+- Code blocks (```) for code references
+- Bold (**) for emphasis on key points""",
+    "html": """Provide your review in HTML format with:
+- <h2> tags for each severity level (Errors, Warnings, Info)
+- <ul>/<li> for individual issues
+- <pre><code> for code references
+- <strong> for emphasis on key points
+- Use appropriate semantic HTML tags
+- Do NOT include <html>, <head>, or <body> tags - just the content""",
+    "json": """Provide your review in JSON format with this structure:
+{
+  "summary": "Brief one-line summary of the review",
+  "errors": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "warnings": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "info": [
+    {"issue": "description", "location": "file:line", "suggestion": "fix"}
+  ],
+  "passed_checks": ["list of checks that passed"],
+  "overall_status": "PASS|WARN|FAIL"
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """Please review the following DPDK patch file '{patch_name}' \
+against the AGENTS.md guidelines. Focus on:
+
+1. Correctness bugs (resource leaks, use-after-free, race conditions, etc.)
+2. C coding style (forbidden tokens, implicit comparisons, unnecessary patterns)
+3. API and documentation requirements
+4. Any other guideline violations
+
+Note: commit message formatting and SPDX/copyright compliance are checked \
+by checkpatches.sh and should NOT be flagged here.
+
+{format_instruction}
+
+--- PATCH CONTENT ---
+"""
+
+
+def error(msg: str) -> NoReturn:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def is_lts_release(release: str | None) -> bool:
+    """Check if a release is an LTS release.
+
+    Per DPDK project guidelines, any release with minor version .11
+    is an LTS release (e.g., 19.11, 21.11, 23.11, 24.11, 25.11).
+    """
+    if not release:
+        return False
+    # Check for explicit -lts suffix
+    if "-lts" in release.lower():
+        return True
+    # Extract base version (e.g., "23.11" from "23.11.1" or "23.11-rc1")
+    version = release.split("-")[0]
+    parts = version.split(".")
+    if len(parts) >= 2:
+        try:
+            minor = int(parts[1])
+            return minor == 11
+        except ValueError:
+            pass
+    return False
+
+
+def estimate_tokens(text: str) -> int:
+    """Estimate token count from text length."""
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def split_mbox_patches(content: str) -> list[str]:
+    """Split an mbox file into individual patches."""
+    patches = []
+    current_patch = []
+    in_patch = False
+
+    for line in content.split("\n"):
+        # Detect start of new message in mbox format
+        # git-format-patch: "From <40-char-hex> Mon Sep 17 00:00:00 2001"
+        # general mbox: "From <addr> <day-of-week> ..."
+        if line.startswith("From ") and (
+            re.match(r"^From [0-9a-f]{40} ", line)
+            or " Mon " in line
+            or " Tue " in line
+            or " Wed " in line
+            or " Thu " in line
+            or " Fri " in line
+            or " Sat " in line
+            or " Sun " in line
+        ):
+            if current_patch:
+                patches.append("\n".join(current_patch))
+            current_patch = [line]
+            in_patch = True
+        elif in_patch:
+            current_patch.append(line)
+
+    # Don't forget the last patch
+    if current_patch:
+        patches.append("\n".join(current_patch))
+
+    return patches if patches else [content]
+
+
+def extract_commit_messages(content: str) -> str:
+    """Extract only commit messages from patch content."""
+    patches = split_mbox_patches(content)
+    messages = []
+
+    for patch in patches:
+        lines = patch.split("\n")
+        msg_lines = []
+        in_headers = True
+        in_body = False
+        found_subject = False
+
+        for line in lines:
+            # Collect headers we care about
+            if in_headers:
+                if line.startswith("Subject:"):
+                    msg_lines.append(line)
+                    found_subject = True
+                elif line.startswith(("From:", "Date:")):
+                    msg_lines.append(line)
+                elif line.startswith((" ", "\t")) and found_subject:
+                    # Subject continuation
+                    msg_lines.append(line)
+                elif line == "":
+                    if found_subject:
+                        in_headers = False
+                        in_body = True
+                        msg_lines.append("")
+            elif in_body:
+                # Stop at the diffstat separator or diff
+                if line.rstrip() == "---":
+                    break
+                if line.startswith("diff --git"):
+                    break
+                msg_lines.append(line)
+
+        if msg_lines:
+            messages.append("\n".join(msg_lines))
+
+    return "\n\n---\n\n".join(messages)
+
+
+def truncate_content(
+    content: str, max_tokens: float
+) -> tuple[str, bool]:
+    """Truncate content to fit within token limit."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN)
+
+    if len(content) <= max_chars:
+        return content, False
+
+    # Try to truncate at a reasonable boundary
+    truncated = content[:max_chars]
+
+    # Find last complete diff hunk or patch boundary
+    last_diff = truncated.rfind("\ndiff --git")
+    last_patch = truncated.rfind("\nFrom ")
+
+    if last_diff > max_chars * 0.5:
+        truncated = truncated[:last_diff]
+    elif last_patch > max_chars * 0.5:
+        truncated = truncated[:last_patch]
+
+    truncated += "\n\n[... Content truncated due to size limits ...]\n"
+    return truncated, True
+
+
+def chunk_content(
+    content: str, max_tokens: int
+) -> Iterator[tuple[str, int, int]]:
+    """Split content into chunks that fit within token limit.
+
+    Yields tuples of (chunk_content, chunk_number, total_chunks).
+    """
+    patches = split_mbox_patches(content)
+
+    if len(patches) == 1:
+        # Single large patch - split by diff sections
+        yield from chunk_single_patch(content, max_tokens)
+        return
+
+    # Multiple patches - group them to fit within limits
+    chunks = []
+    current_chunk = []
+    current_size = 0
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)  # 90% to leave margin
+
+    for patch in patches:
+        patch_size = len(patch)
+        if current_size + patch_size > max_chars and current_chunk:
+            chunks.append("\n".join(current_chunk))
+            current_chunk = []
+            current_size = 0
+
+        if patch_size > max_chars:
+            # Single patch too large, truncate it
+            if current_chunk:
+                chunks.append("\n".join(current_chunk))
+                current_chunk = []
+                current_size = 0
+            truncated, _ = truncate_content(patch, max_tokens * 0.9)
+            chunks.append(truncated)
+        else:
+            current_chunk.append(patch)
+            current_size += patch_size
+
+    if current_chunk:
+        chunks.append("\n".join(current_chunk))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def chunk_single_patch(content: str, max_tokens: int) -> Iterator[tuple[str, int, int]]:
+    """Split a single large patch by diff sections."""
+    max_chars = int(max_tokens * CHARS_PER_TOKEN * 0.9)
+
+    # Extract header (everything before first diff)
+    first_diff = content.find("\ndiff --git")
+    if first_diff == -1:
+        # No diff sections, just truncate
+        truncated, _ = truncate_content(content, max_tokens * 0.9)
+        yield truncated, 1, 1
+        return
+
+    header = content[: first_diff + 1]
+    diff_content = content[first_diff + 1 :]
+
+    # Split by diff sections
+    diffs = []
+    current_diff = []
+    for line in diff_content.split("\n"):
+        if line.startswith("diff --git") and current_diff:
+            diffs.append("\n".join(current_diff))
+            current_diff = []
+        current_diff.append(line)
+    if current_diff:
+        diffs.append("\n".join(current_diff))
+
+    # Group diffs into chunks
+    chunks = []
+    current_chunk_diffs = []
+    current_size = len(header)
+
+    for diff in diffs:
+        diff_size = len(diff)
+        if current_size + diff_size > max_chars and current_chunk_diffs:
+            chunks.append(header + "\n".join(current_chunk_diffs))
+            current_chunk_diffs = []
+            current_size = len(header)
+
+        if diff_size + len(header) > max_chars:
+            # Single diff too large
+            if current_chunk_diffs:
+                chunks.append(header + "\n".join(current_chunk_diffs))
+                current_chunk_diffs = []
+            truncated_diff = diff[: max_chars - len(header) - 100]
+            truncated_diff += "\n[... diff truncated ...]\n"
+            chunks.append(header + truncated_diff)
+            current_size = len(header)
+        else:
+            current_chunk_diffs.append(diff)
+            current_size += diff_size
+
+    if current_chunk_diffs:
+        chunks.append(header + "\n".join(current_chunk_diffs))
+
+    total = len(chunks)
+    for i, chunk in enumerate(chunks, 1):
+        yield chunk, i, total
+
+
+def get_summary_prompt() -> str:
+    """Get prompt modifications for summary mode."""
+    return """
+NOTE: This is a LARGE patch series. Provide a HIGH-LEVEL summary review only:
+- Focus on overall architecture and design concerns
+- Check commit message content across the series (formatting is
+  already covered by checkpatches.sh)
+- Identify any obvious policy violations
+- Do NOT attempt detailed line-by-line code review
+- Summarize the scope and purpose of the changes
+"""
+
+
+def format_combined_reviews(
+    reviews: list[tuple[str, str]], output_format: str, patch_name: str
+) -> str:
+    """Combine multiple chunk/patch reviews into a single output."""
+    if output_format == "json":
+        combined = {
+            "patch_file": patch_name,
+            "sections": [
+                {"label": label, "review": review} for label, review in reviews
+            ],
+        }
+        return json.dumps(combined, indent=2)
+    elif output_format == "html":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"<h2>{label}</h2>\n{review}")
+        return "\n<hr>\n".join(sections)
+    elif output_format == "markdown":
+        sections = []
+        for label, review in reviews:
+            sections.append(f"## {label}\n\n{review}")
+        return "\n\n---\n\n".join(sections)
+    else:  # text
+        sections = []
+        for label, review in reviews:
+            sections.append(f"=== {label} ===\n\n{review}")
+        separator = "\n\n" + "=" * 60 + "\n\n"
+        return separator.join(sections)
+
+
+def build_system_prompt(review_date: str, release: str | None) -> str:
+    """Build system prompt with date and release context."""
+    prompt = SYSTEM_PROMPT_BASE
+    prompt += f"\n\nCurrent date: {review_date}."
+
+    if release:
+        prompt += f"\nTarget DPDK release: {release}."
+        if is_lts_release(release):
+            prompt += LTS_RULES
+        else:
+            prompt += "\nThis is a main branch or standard release."
+            prompt += "\nNew features and experimental APIs are allowed."
+
+    return prompt
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": system_prompt},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": system_prompt},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + patch_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    user_prompt = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    return {
+        "systemInstruction": {
+            "parts": [
+                {"text": system_prompt},
+                {"text": agents_content},
+            ]
+        },
+        "contents": [
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + patch_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    system_prompt: str,
+    agents_content: str,
+    patch_content: str,
+    patch_name: str,
+    output_format: str = "text",
+    verbose: bool = False,
+    timeout: int = 300,
+) -> tuple[str, TokenUsage]:
+    """Make API request to the specified provider.
+
+    Returns a tuple of (response_text, token_usage).
+    """
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-goog-api-key": api_key,
+        }
+        url = f"{config['endpoint']}/{model}:generateContent"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            output_format,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request with retries for transient errors
+    request_body = json.dumps(request_data).encode("utf-8")
+    max_retries = 3
+    result = None
+
+    for attempt in range(max_retries + 1):
+        req = Request(url, data=request_body, headers=headers, method="POST")
+        try:
+            with urlopen(req, timeout=timeout) as response:
+                result = json.loads(response.read().decode("utf-8"))
+            break  # Success
+        except HTTPError as e:
+            if e.code in (429, 503, 529) and attempt < max_retries:
+                delay = 2 ** (attempt + 1)  # 2, 4, 8 seconds
+                # Check for Retry-After header
+                retry_after = e.headers.get("Retry-After")
+                if retry_after:
+                    try:
+                        delay = max(delay, int(retry_after))
+                    except ValueError:
+                        pass
+                print(
+                    f"API returned {e.code}, retrying in {delay}s "
+                    f"(retry {attempt + 1} of {max_retries})...",
+                    file=sys.stderr,
+                )
+                e.read()  # Drain the response body
+                time.sleep(delay)
+                continue
+            error_body = e.read().decode("utf-8")
+            try:
+                error_data = json.loads(error_body)
+                error(f"API error: {error_data.get('error', error_body)}")
+            except json.JSONDecodeError:
+                error(f"API error ({e.code}): {error_body}")
+        except URLError as e:
+            if isinstance(e.reason, TimeoutError):
+                error(f"API request timed out after {timeout} seconds")
+            error(f"Connection error: {e.reason}")
+        except TimeoutError:
+            error(f"API request timed out after {timeout} seconds")
+
+    if result is None:
+        error("API request failed after all retries")
+
+    # Extract token usage
+    usage = TokenUsage(api_calls=1)
+    if provider == "anthropic":
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("input_tokens", 0)
+        usage.output_tokens = raw_usage.get("output_tokens", 0)
+        usage.cache_creation_tokens = raw_usage.get(
+            "cache_creation_input_tokens", 0
+        )
+        usage.cache_read_tokens = raw_usage.get("cache_read_input_tokens", 0)
+    elif provider == "google":
+        raw_usage = result.get("usageMetadata", {})
+        usage.input_tokens = raw_usage.get("promptTokenCount", 0)
+        usage.output_tokens = raw_usage.get("candidatesTokenCount", 0)
+    else:  # openai, xai
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("prompt_tokens", 0)
+        usage.output_tokens = raw_usage.get("completion_tokens", 0)
+        # OpenAI cache details (if available)
+        cache_details = raw_usage.get("prompt_tokens_details", {})
+        if cache_details:
+            usage.cache_read_tokens = cache_details.get("cached_tokens", 0)
+
+    # Show per-call details in verbose mode
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        print(f"Input tokens: {usage.input_tokens:,}", file=sys.stderr)
+        print(f"Output tokens: {usage.output_tokens:,}", file=sys.stderr)
+        if usage.cache_creation_tokens:
+            print(
+                f"Cache creation: {usage.cache_creation_tokens:,}",
+                file=sys.stderr,
+            )
+        if usage.cache_read_tokens:
+            print(
+                f"Cache read: {usage.cache_read_tokens:,}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        text = "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+        return text, usage
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        text = "".join(part.get("text", "") for part in parts)
+        return text, usage
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        text = choices[0].get("message", {}).get("content", "")
+        return text, usage
+
+
+def get_last_message_id(patch_content: str) -> str | None:
+    """Extract Message-ID from the last patch in an mbox."""
+    msg_ids = re.findall(
+        r"^Message-ID:\s*(.+)$", patch_content, re.MULTILINE | re.IGNORECASE
+    )
+    if msg_ids:
+        msg_id = msg_ids[-1].strip()
+        # Normalize: remove < > and add them back
+        msg_id = msg_id.strip("<>")
+        return f"<{msg_id}>"
+    return None
+
+
+def get_last_subject(patch_content: str) -> str | None:
+    """Extract subject from the last patch in an mbox."""
+    # Find all Subject lines with potential continuations
+    subjects = []
+    lines = patch_content.split("\n")
+    i = 0
+    while i < len(lines):
+        if lines[i].lower().startswith("subject:"):
+            subject = lines[i][8:].strip()
+            i += 1
+            # Handle continuation lines (RFC 2822 folding)
+            while i < len(lines) and lines[i].startswith((" ", "\t")):
+                subject += " " + lines[i].strip()
+                i += 1
+            subjects.append(subject)
+        else:
+            i += 1
+    return subjects[-1] if subjects else None
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+) -> bool:
+    """Send review email using git send-email, sendmail, or msmtp."""
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    email_text = msg.as_string()
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(email_text, file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return True
+
+    # Write to temp file for git send-email
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".eml", delete=False) as f:
+        f.write(email_text)
+        temp_file = f.name
+
+    try:
+        # Try git send-email first
+        if get_git_config("sendemail.smtpserver"):
+            # Build command with all arguments
+            flat_cmd = ["git", "send-email", "--confirm=never", "--quiet"]
+            for addr in to_addrs:
+                flat_cmd.extend(["--to", addr])
+            for addr in cc_addrs:
+                flat_cmd.extend(["--cc", addr])
+            if from_addr:
+                flat_cmd.extend(["--from", from_addr])
+            if in_reply_to:
+                flat_cmd.extend(["--in-reply-to", in_reply_to])
+            flat_cmd.append(temp_file)
+
+            try:
+                subprocess.run(flat_cmd, check=True, capture_output=True)
+                print("Email sent via git send-email", file=sys.stderr)
+                return True
+            except (subprocess.CalledProcessError, FileNotFoundError):
+                pass
+
+        # Try sendmail
+        try:
+            subprocess.run(
+                ["sendmail", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via sendmail", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        # Try msmtp
+        try:
+            subprocess.run(
+                ["msmtp", "-t"],
+                input=email_text,
+                text=True,
+                capture_output=True,
+                check=True,
+            )
+            print("Email sent via msmtp", file=sys.stderr)
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+
+        error("Could not send email. Configure git send-email, sendmail, or msmtp.")
+
+    finally:
+        os.unlink(temp_file)
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Analyze DPDK patches using AI providers",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s patch.patch                    # Review with default settings
+    %(prog)s -p openai my-patch.patch       # Use OpenAI ChatGPT
+    %(prog)s -f markdown patch.patch        # Output as Markdown
+    %(prog)s -f json -o review.json patch.patch  # Save JSON to file
+    %(prog)s -f html -o review.html patch.patch  # Save HTML to file
+    %(prog)s -r 24.11 patch.patch           # Review for specific release
+    %(prog)s -r 24.11-lts patch.patch       # Review for LTS branch
+    %(prog)s --send-email --to dev@dpdk.org series.mbox
+    %(prog)s --send-email --to dev@dpdk.org --dry-run series.mbox
+
+Large File Handling:
+    %(prog)s --split-patches series.mbox    # Review each patch separately
+    %(prog)s --split-patches --patch-range 1-5 series.mbox  # Review patches 1-5
+    %(prog)s --large-file=truncate patch.mbox   # Truncate to fit limit
+    %(prog)s --large-file=commits-only series.mbox  # Review commit messages only
+    %(prog)s --large-file=summary series.mbox   # High-level summary only
+    %(prog)s --large-file=chunk series.mbox     # Split and review in chunks
+
+Large File Modes:
+    error       - Fail with error (default)
+    truncate    - Truncate content to fit token limit
+    chunk       - Split into chunks and review each
+    commits-only - Extract and review only commit messages
+    summary     - Request high-level summary review
+
+LTS Releases:
+    Use -r/--release with an LTS version (e.g., 24.11-lts, 23.11) to enable
+    stricter review rules: bug fixes only, no new features or APIs.
+    Any DPDK release with minor version .11 is an LTS release.
+
+Token Usage:
+    Token counts are always printed to stderr after each run.
+    %(prog)s -c patch.patch                  # Include estimated cost
+    %(prog)s -c -f json -o r.json patch.patch  # Cost in JSON metadata too
+        """,
+    )
+
+    parser.add_argument("patch_file", nargs="?", help="Patch file to analyze")
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=4096,
+        help="Max tokens for response (default: 4096)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=OUTPUT_FORMATS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        help="Write output to file instead of stdout",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+    parser.add_argument(
+        "-c",
+        "--show-costs",
+        action="store_true",
+        help="Show estimated cost alongside token usage summary",
+    )
+    parser.add_argument(
+        "--timeout",
+        type=int,
+        default=300,
+        metavar="SECONDS",
+        help="API request timeout in seconds (default: 300)",
+    )
+
+    # Date and release options
+    parser.add_argument(
+        "-D",
+        "--date",
+        metavar="YYYY-MM-DD",
+        help="Review date context (default: today)",
+    )
+    parser.add_argument(
+        "-r",
+        "--release",
+        metavar="VERSION",
+        help="Target DPDK release (e.g., 24.11, 23.11-lts)",
+    )
+
+    # Large file handling options
+    large_group = parser.add_argument_group("Large File Handling")
+    large_group.add_argument(
+        "--large-file",
+        choices=LARGE_FILE_MODES,
+        default="error",
+        metavar="MODE",
+        help="How to handle large files: error (default), truncate, "
+        "chunk, commits-only, summary",
+    )
+    large_group.add_argument(
+        "--max-tokens",
+        type=int,
+        metavar="N",
+        help="Max input tokens (default: provider-specific)",
+    )
+    large_group.add_argument(
+        "--split-patches",
+        action="store_true",
+        help="Split mbox into individual patches and review each separately",
+    )
+    large_group.add_argument(
+        "--patch-range",
+        metavar="N-M",
+        help="Review only patches N through M (1-indexed, use with --split-patches)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check patch file is provided
+    if not args.patch_file:
+        parser.error("patch_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    patch_path = Path(args.patch_file)
+    if not patch_path.exists():
+        error(f"Patch file not found: {args.patch_file}")
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Determine review date
+    review_date = args.date or date.today().isoformat()
+
+    # Build system prompt with date and release context
+    system_prompt = build_system_prompt(review_date, args.release)
+
+    # Read files
+    agents_content = agents_path.read_text()
+    patch_content = patch_path.read_text()
+    patch_name = patch_path.name
+
+    # Determine max tokens for this provider
+    max_input_tokens = args.max_tokens or PROVIDER_INPUT_LIMITS.get(
+        args.provider, 100000
+    )
+
+    # Estimate overhead tokens (system prompt, agents, user prompt template)
+    format_instruction = FORMAT_INSTRUCTIONS.get(args.output_format, "")
+    user_prompt_template = USER_PROMPT.format(
+        patch_name=patch_name, format_instruction=format_instruction
+    )
+    overhead_tokens = estimate_tokens(
+        system_prompt + agents_content + user_prompt_template
+    )
+
+    # Estimate total token count including overhead
+    estimated_tokens = estimate_tokens(patch_content) + overhead_tokens
+
+    # Max tokens available for patch content alone
+    max_patch_tokens = max_input_tokens - overhead_tokens
+
+    # Accumulate token usage across all API calls
+    total_usage = TokenUsage()
+    already_reviewed = False
+
+    # Parse patch range if specified
+    patch_start, patch_end = None, None
+    if args.patch_range:
+        try:
+            if "-" in args.patch_range:
+                start, end = args.patch_range.split("-", 1)
+                patch_start = int(start)
+                patch_end = int(end)
+            else:
+                patch_start = patch_end = int(args.patch_range)
+        except ValueError:
+            error(f"Invalid --patch-range format: {args.patch_range}")
+        if not args.split_patches:
+            print(
+                "Warning: --patch-range has no effect without --split-patches",
+                file=sys.stderr,
+            )
+
+    # Handle --split-patches mode
+    review_text = ""
+    if args.split_patches:
+        patches = split_mbox_patches(patch_content)
+        total_patches = len(patches)
+
+        if total_patches == 1:
+            if patch_start is not None and (patch_start > 1 or patch_end > 1):
+                error(
+                    f"Only 1 patch found in mbox, but --patch-range "
+                    f"{args.patch_range} is out of range"
+                )
+            print(
+                "Note: Only 1 patch found in mbox, --split-patches has no effect",
+                file=sys.stderr,
+            )
+        else:
+            print(
+                f"Found {total_patches} patches in mbox",
+                file=sys.stderr,
+            )
+
+            # Apply patch range filter
+            if patch_start is not None:
+                if patch_start < 1 or patch_start > total_patches:
+                    error(
+                        f"Patch range start {patch_start} out of range (1-{total_patches})"
+                    )
+                if patch_end < patch_start or patch_end > total_patches:
+                    error(
+                        f"Patch range end {patch_end} out of range ({patch_start}-{total_patches})"
+                    )
+                patches = patches[patch_start - 1 : patch_end]
+                print(
+                    f"Reviewing patches {patch_start}-{patch_end} ({len(patches)} patches)",
+                    file=sys.stderr,
+                )
+
+            # Review each patch separately
+            all_reviews = []
+            for i, patch in enumerate(patches, patch_start or 1):
+                patch_label = f"Patch {i}/{total_patches}"
+                print(f"\nReviewing {patch_label}...", file=sys.stderr)
+
+                review_text, call_usage = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    patch,
+                    f"{patch_name} ({patch_label})",
+                    args.output_format,
+                    args.verbose,
+                    args.timeout,
+                )
+                total_usage.add(call_usage)
+                all_reviews.append((patch_label, review_text))
+
+            # Combine reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal API call
+            already_reviewed = True
+
+    # Check if content is too large (skip if already processed via split)
+    is_large = not already_reviewed and estimated_tokens > max_input_tokens
+
+    if is_large:
+        print(
+            f"Warning: Estimated {estimated_tokens:,} tokens exceeds limit of "
+            f"{max_input_tokens:,}",
+            file=sys.stderr,
+        )
+
+        if args.large_file == "error":
+            error(
+                f"Patch file too large ({estimated_tokens:,} tokens). "
+                f"Use --large-file=truncate|chunk|commits-only|summary to handle, "
+                f"or --split-patches to review patches individually."
+            )
+        elif args.large_file == "truncate":
+            print("Truncating content to fit token limit...", file=sys.stderr)
+            patch_content, was_truncated = truncate_content(
+                patch_content, max_patch_tokens
+            )
+            if was_truncated:
+                print("Content was truncated.", file=sys.stderr)
+        elif args.large_file == "commits-only":
+            print("Extracting commit messages only...", file=sys.stderr)
+            patch_content = extract_commit_messages(patch_content)
+            new_estimate = estimate_tokens(patch_content) + overhead_tokens
+            print(
+                f"Reduced to ~{new_estimate:,} tokens (commit messages only)",
+                file=sys.stderr,
+            )
+            if new_estimate > max_input_tokens:
+                patch_content, _ = truncate_content(
+                    patch_content, max_patch_tokens
+                )
+        elif args.large_file == "summary":
+            print("Using summary mode for large patch...", file=sys.stderr)
+            system_prompt += get_summary_prompt()
+            patch_content, _ = truncate_content(
+                patch_content, max_patch_tokens
+            )
+        elif args.large_file == "chunk":
+            print("Processing in chunks...", file=sys.stderr)
+            all_reviews = []
+            for chunk, chunk_num, total_chunks in chunk_content(
+                patch_content, max_patch_tokens
+            ):
+                chunk_label = f"Chunk {chunk_num}/{total_chunks}"
+                print(f"Reviewing {chunk_label}...", file=sys.stderr)
+
+                review_text, call_usage = call_api(
+                    args.provider,
+                    api_key,
+                    model,
+                    args.tokens,
+                    system_prompt,
+                    agents_content,
+                    chunk,
+                    f"{patch_name} ({chunk_label})",
+                    args.output_format,
+                    args.verbose,
+                    args.timeout,
+                )
+                total_usage.add(call_usage)
+                all_reviews.append((chunk_label, review_text))
+
+            # Combine chunk reviews
+            review_text = format_combined_reviews(
+                all_reviews, args.output_format, patch_name
+            )
+
+            # Skip the normal single API call below
+            already_reviewed = True
+
+    if args.verbose:
+        print("=== Request ===", file=sys.stderr)
+        print(f"Provider: {args.provider}", file=sys.stderr)
+        print(f"Model: {model}", file=sys.stderr)
+        print(f"Review date: {review_date}", file=sys.stderr)
+        if args.release:
+            lts_status = " (LTS)" if is_lts_release(args.release) else ""
+            print(f"Target release: {args.release}{lts_status}", file=sys.stderr)
+        print(f"Output format: {args.output_format}", file=sys.stderr)
+        print(f"AGENTS file: {args.agents}", file=sys.stderr)
+        print(f"Patch file: {args.patch_file}", file=sys.stderr)
+        print(f"Estimated tokens: {estimated_tokens:,}", file=sys.stderr)
+        print(f"Max input tokens: {max_input_tokens:,}", file=sys.stderr)
+        if args.large_file != "error":
+            print(f"Large file mode: {args.large_file}", file=sys.stderr)
+        if args.split_patches:
+            print("Split patches: yes", file=sys.stderr)
+        if args.output:
+            print(f"Output file: {args.output}", file=sys.stderr)
+        if args.send_email:
+            print("Send email: yes", file=sys.stderr)
+            print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+            if args.cc_addrs:
+                print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+            print(f"From: {from_addr}", file=sys.stderr)
+        print("===============", file=sys.stderr)
+
+    # Call API (unless already processed via chunks/split)
+    if not already_reviewed:
+        review_text, call_usage = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            system_prompt,
+            agents_content,
+            patch_content,
+            patch_name,
+            args.output_format,
+            args.verbose,
+            args.timeout,
+        )
+        total_usage.add(call_usage)
+
+    if not review_text:
+        error(f"No response received from {args.provider}")
+
+    # Format output based on requested format
+    provider_name = config["name"]
+
+    if args.output_format == "json":
+        # For JSON, try to parse and add metadata
+        try:
+            review_data = json.loads(review_text)
+        except json.JSONDecodeError:
+            # If AI didn't return valid JSON, wrap the text
+            review_data = {"raw_review": review_text}
+
+        usage_data = {
+            "api_calls": total_usage.api_calls,
+            "input_tokens": total_usage.input_tokens,
+            "output_tokens": total_usage.output_tokens,
+            "total_tokens": total_usage.input_tokens + total_usage.output_tokens,
+        }
+        if total_usage.cache_creation_tokens:
+            usage_data["cache_creation_tokens"] = total_usage.cache_creation_tokens
+        if total_usage.cache_read_tokens:
+            usage_data["cache_read_tokens"] = total_usage.cache_read_tokens
+        if args.show_costs:
+            usage_data["estimated_cost_usd"] = round(
+                estimate_cost(total_usage, args.provider, model), 6
+            )
+
+        output_data = {
+            "metadata": {
+                "patch_file": patch_name,
+                "provider": args.provider,
+                "provider_name": provider_name,
+                "model": model,
+                "review_date": review_date,
+                "target_release": args.release,
+                "is_lts": is_lts_release(args.release) if args.release else False,
+                "token_usage": usage_data,
+            },
+            "review": review_data,
+        }
+        output_text = json.dumps(output_data, indent=2)
+    elif args.output_format == "html":
+        # Wrap HTML content with header
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"<br>Target release: {args.release}{lts_badge}"
+        output_text = f"""<!-- AI-generated review of {patch_name} -->
+<!-- Reviewed using {provider_name} ({model}) on {review_date} -->
+<div class="patch-review">
+<h1>Patch Review: {patch_name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model}) on {review_date}{release_info}</p>
+{review_text}
+</div>
+"""
+    elif args.output_format == "markdown":
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"\n*Target release: {args.release}{lts_badge}*\n"
+        output_text = f"""# Patch Review: {patch_name}
+
+*Reviewed by {provider_name} ({model}) on {review_date}*
+{release_info}
+{review_text}
+"""
+    else:  # text
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+        output_text = f"=== Patch Review: {patch_name} (via {provider_name}) ===\n"
+        output_text += f"Review date: {review_date}\n"
+        output_text += release_info
+        output_text += "\n" + review_text
+
+    # Write output
+    if args.output:
+        Path(args.output).write_text(output_text)
+        print(f"Review written to: {args.output}", file=sys.stderr)
+    else:
+        print(output_text)
+
+    # Print token usage summary
+    if total_usage.api_calls > 0:
+        print("", file=sys.stderr)
+        print(
+            format_token_summary(
+                total_usage, args.provider, model, args.show_costs
+            ),
+            file=sys.stderr,
+        )
+
+    # Send email if requested
+    if args.send_email:
+        # Email always uses plain text - warn if different format requested
+        if args.output_format != "text":
+            print(
+                f"Note: Email will be sent as plain text regardless of "
+                f"--format={args.output_format}",
+                file=sys.stderr,
+            )
+
+        in_reply_to = get_last_message_id(patch_content)
+        orig_subject = get_last_subject(patch_content)
+
+        if orig_subject:
+            # Remove [PATCH n/m] prefix
+            review_subject = re.sub(r"^\[PATCH[^\]]*\]\s*", "", orig_subject)
+            review_subject = f"[REVIEW] {review_subject}"
+        else:
+            review_subject = f"[REVIEW] {patch_name}"
+
+        # Build email body - always use plain text version
+        release_info = ""
+        if args.release:
+            lts_badge = " (LTS)" if is_lts_release(args.release) else ""
+            release_info = f"Target release: {args.release}{lts_badge}\n"
+
+        email_body = f"""AI-generated review of {patch_name}
+Reviewed using {provider_name} ({model}) on {review_date}
+{release_info}
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+        if args.verbose:
+            print("", file=sys.stderr)
+            print("=== Email Details ===", file=sys.stderr)
+            print(f"Subject: {review_subject}", file=sys.stderr)
+            print(f"In-Reply-To: {in_reply_to}", file=sys.stderr)
+            print("=====================", file=sys.stderr)
+
+        send_email(
+            args.to_addrs,
+            args.cc_addrs,
+            from_addr,
+            review_subject,
+            in_reply_to,
+            email_body,
+            args.dry_run,
+        )
+
+        if not args.dry_run:
+            print("", file=sys.stderr)
+            print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v13 3/6] devtools: add compare-reviews.sh for multi-provider analysis
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
@ 2026-04-02 19:44     ` Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add script to run patch reviews across multiple AI providers for
comparison purposes.

The script automatically detects which providers have API keys
configured and runs analyze-patch.py for each one. This allows
users to compare review quality and feedback across different
AI models.

Features:
  - Auto-detects available providers based on environment variables
  - Optional provider selection via -p/--providers option
  - Saves individual reviews to separate files with -o/--output
  - Verbose mode passes through to underlying analyze-patch.py
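
The auto-detection rule can be sketched in Python as follows (a minimal
illustration only; the script itself performs the same check in shell, and
the function name here is hypothetical, though the environment variable
names are the ones the script documents):

```python
import os

# API key environment variable checked for each provider (from the script's
# documented environment variables).
ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}

def available_providers(env=None):
    """Return providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in ENV_VARS.items() if env.get(var)]
```

A provider whose variable is unset or empty is skipped, matching the
shell test `[[ -n "$VAR" ]]` used below.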

Usage:
  ./devtools/compare-reviews.sh my-patch.patch
  ./devtools/compare-reviews.sh -p anthropic,xai my-patch.patch
  ./devtools/compare-reviews.sh -o ./reviews my-patch.patch

Output files are named <patch>-<provider>.<ext> when using the
output directory option, where the extension matches the selected
format (txt, md, html, or json).
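
That naming scheme can be sketched as (an illustrative Python equivalent
of the script's extension mapping; the function and argument names are
hypothetical):

```python
from pathlib import Path

# Output format to file extension, mirroring get_extension() in the script.
EXTENSIONS = {"text": "txt", "markdown": "md", "html": "html", "json": "json"}

def review_path(output_dir, patch_file, provider, fmt="text"):
    """Build the per-provider review filename: <patch-stem>-<provider>.<ext>."""
    stem = Path(patch_file).stem
    ext = EXTENSIONS.get(fmt, "txt")
    return f"{output_dir}/{stem}-{provider}.{ext}"
```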

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/compare-reviews.sh | 263 ++++++++++++++++++++++++++++++++++++
 1 file changed, 263 insertions(+)
 create mode 100755 devtools/compare-reviews.sh

diff --git a/devtools/compare-reviews.sh b/devtools/compare-reviews.sh
new file mode 100755
index 0000000000..b4813cb6a7
--- /dev/null
+++ b/devtools/compare-reviews.sh
@@ -0,0 +1,263 @@
+#!/bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+# Compare DPDK patch reviews across multiple AI providers
+# Runs analyze-patch.py with each available provider
+
+set -o pipefail
+
+SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
+ANALYZE_SCRIPT="${SCRIPT_DIR}/analyze-patch.py"
+AGENTS_FILE="AGENTS.md"
+OUTPUT_DIR=""
+PROVIDERS=""
+FORMAT="text"
+VERBOSE=""
+EXTRA_ARGS=()
+
+usage() {
+    cat <<EOF
+Usage: $(basename "$0") [OPTIONS] <patch-file>
+
+Compare DPDK patch reviews across multiple AI providers.
+
+Options:
+    -a, --agents FILE      Path to AGENTS.md file (default: AGENTS.md)
+    -o, --output DIR       Save individual reviews to directory
+    -p, --providers LIST   Comma-separated list of providers to use
+                           (default: all providers with API keys set)
+    -f, --format FORMAT    Output format: text, markdown, html, json
+                           (default: text)
+    -t, --tokens N         Max tokens for response
+    -D, --date DATE        Review date context (YYYY-MM-DD)
+    -r, --release VERSION  Target DPDK release (e.g., 24.11, 23.11-lts)
+    --split-patches        Split mbox into individual patches
+    --patch-range N-M      Review only patches N through M
+    --large-file MODE      Handle large files: error, truncate, chunk,
+                           commits-only, summary
+    --max-tokens N         Max input tokens
+    -v, --verbose          Show verbose output from each provider
+    -h, --help             Show this help message
+
+Environment Variables:
+    Set API keys for providers you want to use:
+    ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY
+
+Examples:
+    $(basename "$0") my-patch.patch
+    $(basename "$0") -p anthropic,openai my-patch.patch
+    $(basename "$0") -o ./reviews -f markdown my-patch.patch
+    $(basename "$0") -r 24.11 --split-patches series.mbox
+EOF
+    exit "${1:-0}"
+}
+
+error() {
+    echo "Error: $1" >&2
+    exit 1
+}
+
+# Check which providers have API keys configured
+get_available_providers() {
+    local available=""
+
+    [[ -n "$ANTHROPIC_API_KEY" ]] && available="${available}anthropic,"
+    [[ -n "$OPENAI_API_KEY" ]] && available="${available}openai,"
+    [[ -n "$XAI_API_KEY" ]] && available="${available}xai,"
+    [[ -n "$GOOGLE_API_KEY" ]] && available="${available}google,"
+
+    # Remove trailing comma
+    echo "${available%,}"
+}
+
+# Get file extension for format
+get_extension() {
+    case "$1" in
+        text)     echo "txt" ;;
+        markdown) echo "md" ;;
+        html)     echo "html" ;;
+        json)     echo "json" ;;
+        *)        echo "txt" ;;
+    esac
+}
+
+# Parse command line options
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        -a|--agents)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            AGENTS_FILE="$2"
+            shift 2
+            ;;
+        -o|--output)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            OUTPUT_DIR="$2"
+            shift 2
+            ;;
+        -p|--providers)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            PROVIDERS="$2"
+            shift 2
+            ;;
+        -f|--format)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            FORMAT="$2"
+            shift 2
+            ;;
+        -t|--tokens)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            EXTRA_ARGS+=("-t" "$2")
+            shift 2
+            ;;
+        -D|--date)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            EXTRA_ARGS+=("-D" "$2")
+            shift 2
+            ;;
+        -r|--release)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            EXTRA_ARGS+=("-r" "$2")
+            shift 2
+            ;;
+        --split-patches)
+            EXTRA_ARGS+=("--split-patches")
+            shift
+            ;;
+        --patch-range)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            EXTRA_ARGS+=("--patch-range" "$2")
+            shift 2
+            ;;
+        --large-file)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            EXTRA_ARGS+=("--large-file" "$2")
+            shift 2
+            ;;
+        --large-file=*)
+            EXTRA_ARGS+=("$1")
+            shift
+            ;;
+        --max-tokens)
+            [[ -z "${2:-}" || "$2" == -* ]] && error "$1 requires an argument"
+            EXTRA_ARGS+=("--max-tokens" "$2")
+            shift 2
+            ;;
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -h|--help)
+            usage 0
+            ;;
+        -*)
+            error "Unknown option: $1"
+            ;;
+        *)
+            break
+            ;;
+    esac
+done
+
+# Check for required arguments
+if [[ $# -lt 1 ]]; then
+    echo "Error: No patch file specified" >&2
+    usage 1
+fi
+
+PATCH_FILE="$1"
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    error "Patch file not found: $PATCH_FILE"
+fi
+
+if [[ ! -f "$ANALYZE_SCRIPT" ]]; then
+    error "analyze-patch.py not found: $ANALYZE_SCRIPT"
+fi
+
+if [[ ! -f "$AGENTS_FILE" ]]; then
+    error "AGENTS.md not found: $AGENTS_FILE"
+fi
+
+# Validate format
+case "$FORMAT" in
+    text|markdown|html|json) ;;
+    *) error "Invalid format: $FORMAT (must be text, markdown, html, or json)" ;;
+esac
+
+# Get providers to use
+if [[ -z "$PROVIDERS" ]]; then
+    PROVIDERS=$(get_available_providers)
+fi
+
+if [[ -z "$PROVIDERS" ]]; then
+    error "No API keys configured. Set at least one of: "\
+"ANTHROPIC_API_KEY, OPENAI_API_KEY, XAI_API_KEY, GOOGLE_API_KEY"
+fi
+
+# Create output directory if specified
+if [[ -n "$OUTPUT_DIR" ]]; then
+    mkdir -p "$OUTPUT_DIR"
+fi
+
+PATCH_BASENAME=$(basename "$PATCH_FILE")
+PATCH_STEM="${PATCH_BASENAME%.*}"
+EXT=$(get_extension "$FORMAT")
+
+echo "Reviewing patch: $PATCH_BASENAME"
+echo "Providers: $PROVIDERS"
+echo "Format: $FORMAT"
+echo "========================================"
+echo ""
+
+# Run review for each provider, continue on failure
+IFS=',' read -ra PROVIDER_LIST <<< "$PROVIDERS"
+failures=0
+for provider in "${PROVIDER_LIST[@]}"; do
+    echo ">>> Running review with: $provider"
+    echo ""
+
+    if [[ -n "$OUTPUT_DIR" ]]; then
+        OUTPUT_FILE="${OUTPUT_DIR}/${PATCH_STEM}-${provider}.${EXT}"
+        if python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            ${VERBOSE:+"$VERBOSE"} \
+            "${EXTRA_ARGS[@]}" \
+            "$PATCH_FILE" | tee "$OUTPUT_FILE"; then
+            echo ""
+            echo "Saved to: $OUTPUT_FILE"
+        else
+            echo "FAILED: $provider review failed" >&2
+            rm -f "$OUTPUT_FILE"
+            ((failures++)) || true
+        fi
+    else
+        if ! python3 "$ANALYZE_SCRIPT" \
+            -p "$provider" \
+            -a "$AGENTS_FILE" \
+            -f "$FORMAT" \
+            ${VERBOSE:+"$VERBOSE"} \
+            "${EXTRA_ARGS[@]}" \
+            "$PATCH_FILE"; then
+            echo "FAILED: $provider review failed" >&2
+            ((failures++)) || true
+        fi
+    fi
+
+    echo ""
+    echo "========================================"
+    echo ""
+done
+
+echo "Review comparison complete."
+
+if [[ -n "$OUTPUT_DIR" ]]; then
+    echo "All reviews saved to: $OUTPUT_DIR"
+fi
+
+if [[ $failures -gt 0 ]]; then
+    echo "$failures provider(s) failed." >&2
+    exit 1
+fi
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v13 4/6] devtools: add multi-provider AI documentation review script
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (2 preceding siblings ...)
  2026-04-02 19:44     ` [PATCH v13 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
@ 2026-04-02 19:44     ` Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aaron Conole

Add review-doc.py script that reviews DPDK documentation files for
spelling, grammar, technical correctness, and clarity using AI
language models. Supports batch processing of multiple files.

Supported AI providers:
  - Anthropic Claude (default)
  - OpenAI ChatGPT
  - xAI Grok
  - Google Gemini

Output formats (-f/--format):
  - text: plain text with extractable diff/msg markers (default)
  - markdown: formatted review document
  - html: complete HTML document with styling
  - json: structured data with metadata

For each input file, the script produces:
  - <basename>.{txt,md,html,json}: review in selected format
  - <basename>.diff: unified diff (text/json, or with -d flag)
  - <basename>.msg: commit message (text/json, or with -d flag)

The commit message prefix is automatically determined from the
file path (e.g., doc/guides/prog_guide: for programmer's guide).

Features:
  - Multiple file processing with glob support
  - Provider selection via -p/--provider option
  - Custom model selection via -m/--model option
  - Configurable output directory via -o/--output-dir option
  - Output format selection via -f/--format option
  - Force diff/msg generation via -d/--diff option
  - Quiet mode (-q) suppresses stdout output
  - Verbose mode (-v) shows token usage and API details
  - Email integration using git sendemail configuration
  - Prompt caching support for Anthropic to reduce costs

Usage:
  ./devtools/review-doc.py doc/guides/prog_guide/mempool_lib.rst
  ./devtools/review-doc.py doc/guides/nics/*.rst
  ./devtools/review-doc.py -f html -d -o /tmp doc/guides/nics/*.rst
  ./devtools/review-doc.py --send-email --to dev@dpdk.org file.rst

Requires the appropriate API key environment variable to be set
for the chosen provider (ANTHROPIC_API_KEY, OPENAI_API_KEY,
XAI_API_KEY, or GOOGLE_API_KEY).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/review-doc.py | 1341 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1341 insertions(+)
 create mode 100755 devtools/review-doc.py

diff --git a/devtools/review-doc.py b/devtools/review-doc.py
new file mode 100755
index 0000000000..901a2d9f42
--- /dev/null
+++ b/devtools/review-doc.py
@@ -0,0 +1,1341 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+"""
+Review DPDK documentation files using AI providers.
+
+Produces a diff file and commit message compliant with DPDK standards.
+Accepts multiple documentation files and generates output for each.
+Supported providers: Anthropic Claude, OpenAI ChatGPT, xAI Grok, Google Gemini
+"""
+
+import argparse
+import getpass
+import json
+import os
+import re
+import smtplib
+import ssl
+import subprocess
+import sys
+import time
+from dataclasses import dataclass
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, NoReturn
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+# Map output format to file extension
+FORMAT_EXTENSIONS = {
+    "text": ".txt",
+    "markdown": ".md",
+    "html": ".html",
+    "json": ".json",
+}
+
+# Additional markers for extracting diff/msg (used with --diff flag)
+DIFF_MARKERS_INSTRUCTION = """
+
+ADDITIONALLY, at the end of your response, include these exact markers for automated extraction:
+---COMMIT_MESSAGE_START---
+(same commit message as above)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(same unified diff as above)
+---UNIFIED_DIFF_END---
+"""
+
+# Provider configurations
+PROVIDERS = {
+    "anthropic": {
+        "name": "Claude",
+        "endpoint": "https://api.anthropic.com/v1/messages",
+        "default_model": "claude-sonnet-4-5-20250929",
+        "env_var": "ANTHROPIC_API_KEY",
+    },
+    "openai": {
+        "name": "ChatGPT",
+        "endpoint": "https://api.openai.com/v1/chat/completions",
+        "default_model": "gpt-4.1",
+        "env_var": "OPENAI_API_KEY",
+    },
+    "xai": {
+        "name": "Grok",
+        "endpoint": "https://api.x.ai/v1/chat/completions",
+        "default_model": "grok-4-1-fast-non-reasoning",
+        "env_var": "XAI_API_KEY",
+    },
+    "google": {
+        "name": "Gemini",
+        "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
+        "default_model": "gemini-3-flash-preview",
+        "env_var": "GOOGLE_API_KEY",
+    },
+}
+
+
+@dataclass
+class TokenUsage:
+    """Accumulated token usage across API calls."""
+
+    input_tokens: int = 0
+    output_tokens: int = 0
+    cache_creation_tokens: int = 0
+    cache_read_tokens: int = 0
+    api_calls: int = 0
+
+    def add(self, other: "TokenUsage") -> None:
+        """Accumulate usage from another TokenUsage."""
+        self.input_tokens += other.input_tokens
+        self.output_tokens += other.output_tokens
+        self.cache_creation_tokens += other.cache_creation_tokens
+        self.cache_read_tokens += other.cache_read_tokens
+        self.api_calls += other.api_calls
+
+
+# Pricing per million tokens (USD) - update as prices change.
+# Outer keys are providers; inner keys are model-name prefixes.
+# The longest matching prefix wins; "default" is the per-provider fallback.
+PRICING: dict[str, dict[str, dict[str, float]]] = {
+    "anthropic": {
+        "claude-opus-4": {
+            "input": 15.0, "output": 75.0,
+            "cache_write": 18.75, "cache_read": 1.50,
+        },
+        "claude-sonnet-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+        "claude-haiku-4": {
+            "input": 0.80, "output": 4.0,
+            "cache_write": 1.0, "cache_read": 0.08,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.75, "cache_read": 0.30,
+        },
+    },
+    "openai": {
+        "gpt-4.1": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+        "gpt-4.1-mini": {
+            "input": 0.40, "output": 1.60,
+            "cache_write": 0.40, "cache_read": 0.10,
+        },
+        "gpt-4.1-nano": {
+            "input": 0.10, "output": 0.40,
+            "cache_write": 0.10, "cache_read": 0.025,
+        },
+        "default": {
+            "input": 2.0, "output": 8.0,
+            "cache_write": 2.0, "cache_read": 0.50,
+        },
+    },
+    "xai": {
+        "grok-4": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+        "default": {
+            "input": 3.0, "output": 15.0,
+            "cache_write": 3.0, "cache_read": 0.75,
+        },
+    },
+    "google": {
+        "gemini-3-flash": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+        "default": {
+            "input": 0.15, "output": 0.60,
+            "cache_write": 0.15, "cache_read": 0.0375,
+        },
+    },
+}
+
+
+def get_pricing(provider: str, model: str) -> dict[str, float]:
+    """Look up per-million-token pricing for a provider/model."""
+    provider_prices = PRICING.get(provider, {})
+    # Sort by prefix length descending so longer prefixes match first
+    # (e.g., "gpt-4.1-mini" before "gpt-4.1")
+    for prefix, prices in sorted(
+        provider_prices.items(), key=lambda x: len(x[0]), reverse=True
+    ):
+        if prefix != "default" and model.startswith(prefix):
+            return prices
+    return provider_prices.get(
+        "default", {"input": 0, "output": 0, "cache_write": 0, "cache_read": 0}
+    )
+
+
+def estimate_cost(usage: TokenUsage, provider: str, model: str) -> float:
+    """Estimate cost in USD from token usage."""
+    prices = get_pricing(provider, model)
+    cost = 0.0
+    # Non-cached input tokens = total input - cache_read
+    regular_input = max(0, usage.input_tokens - usage.cache_read_tokens)
+    cost += regular_input * prices.get("input", 0) / 1_000_000
+    cost += usage.output_tokens * prices.get("output", 0) / 1_000_000
+    cost += usage.cache_creation_tokens * prices.get("cache_write", 0) / 1_000_000
+    cost += usage.cache_read_tokens * prices.get("cache_read", 0) / 1_000_000
+    return cost
+
+
+def format_token_summary(
+    usage: TokenUsage, provider: str, model: str, show_costs: bool
+) -> str:
+    """Format a token usage summary string."""
+    lines = ["=== Token Usage Summary ==="]
+    lines.append(f"API calls:     {usage.api_calls}")
+    lines.append(f"Input tokens:  {usage.input_tokens:,}")
+    lines.append(f"Output tokens: {usage.output_tokens:,}")
+    if usage.cache_creation_tokens:
+        lines.append(f"Cache write:   {usage.cache_creation_tokens:,}")
+    if usage.cache_read_tokens:
+        lines.append(f"Cache read:    {usage.cache_read_tokens:,}")
+    total = usage.input_tokens + usage.output_tokens
+    lines.append(f"Total tokens:  {total:,}")
+    if show_costs:
+        cost = estimate_cost(usage, provider, model)
+        lines.append(f"Est. cost:     ${cost:.4f}")
+    lines.append("=" * 27)
+    return "\n".join(lines)
+
+
+# Commit prefix mappings based on file path
+COMMIT_PREFIX_MAP = [
+    ("doc/guides/prog_guide/", "doc/guides/prog_guide:"),
+    ("doc/guides/sample_app_ug/", "doc/guides/sample_app:"),
+    ("doc/guides/nics/", "doc/guides/nics:"),
+    ("doc/guides/cryptodevs/", "doc/guides/cryptodevs:"),
+    ("doc/guides/compressdevs/", "doc/guides/compressdevs:"),
+    ("doc/guides/eventdevs/", "doc/guides/eventdevs:"),
+    ("doc/guides/rawdevs/", "doc/guides/rawdevs:"),
+    ("doc/guides/bbdevs/", "doc/guides/bbdevs:"),
+    ("doc/guides/gpus/", "doc/guides/gpus:"),
+    ("doc/guides/dmadevs/", "doc/guides/dmadevs:"),
+    ("doc/guides/regexdevs/", "doc/guides/regexdevs:"),
+    ("doc/guides/mldevs/", "doc/guides/mldevs:"),
+    ("doc/guides/rel_notes/", "doc/guides/rel_notes:"),
+    ("doc/guides/linux_gsg/", "doc/guides/linux_gsg:"),
+    ("doc/guides/freebsd_gsg/", "doc/guides/freebsd_gsg:"),
+    ("doc/guides/windows_gsg/", "doc/guides/windows_gsg:"),
+    ("doc/guides/tools/", "doc/guides/tools:"),
+    ("doc/guides/testpmd_app_ug/", "doc/guides/testpmd:"),
+    ("doc/guides/howto/", "doc/guides/howto:"),
+    ("doc/guides/contributing/", "doc/guides/contributing:"),
+    ("doc/guides/platform/", "doc/guides/platform:"),
+    ("doc/guides/", "doc:"),
+    ("doc/api/", "doc/api:"),
+    ("doc/", "doc:"),
+]
+
+SYSTEM_PROMPT = """\
+You are an expert technical documentation reviewer for DPDK.
+Your task is to review documentation files and suggest improvements for:
+- Spelling errors
+- Grammar issues
+- Technical correctness
+- Clarity and readability
+- Consistency with DPDK terminology
+
+IMPORTANT COMMIT MESSAGE RULES (from check-git-log.sh):
+- Subject line MUST be ≤60 characters
+- Format: "prefix: lowercase description"
+- First word after colon must be lowercase (except acronyms like Rx, Tx, VF, MAC, API)
+- Use imperative mood (e.g., "fix typo" not "fixed typo" or "fixes typo")
+- NO trailing period on subject line
+- NO punctuation marks: , ; ! ? & |
+- NO underscores in subject after colon
+- Body lines wrapped at 75 characters
+- Body must NOT start with "It"
+- Do NOT include Signed-off-by (user adds via git commit --sign)
+- Only use "Fixes:" tag for actual errors in documentation, not style improvements
+
+Case-sensitive terms (must use exact case):
+- Rx, Tx (not RX, TX, rx, tx)
+- VF, PF (not vf, pf)
+- MAC, VLAN, RSS, API
+- Linux, Windows, FreeBSD
+
+For style/clarity improvements, do NOT use Fixes tag.
+For actual errors (wrong information, broken examples), include Fixes tag \
+if you can identify the commit."""
+
+FORMAT_INSTRUCTIONS = {
+    "text": """
+OUTPUT FORMAT:
+You must output exactly two sections:
+
+1. COMMIT_MESSAGE section containing the complete commit message
+2. UNIFIED_DIFF section containing the unified diff
+
+Use these exact markers:
+---COMMIT_MESSAGE_START---
+(commit message here)
+---COMMIT_MESSAGE_END---
+
+---UNIFIED_DIFF_START---
+(unified diff here)
+---UNIFIED_DIFF_END---
+
+The diff should be in unified format that can be applied with "git apply".
+If no changes are needed, output empty sections with a note.""",
+    "markdown": """
+OUTPUT FORMAT:
+Provide your review in Markdown format with:
+
+## Summary
+Brief description of changes
+
+## Commit Message
+```
+(complete commit message here, ready to use)
+```
+
+## Changes
+For each change:
+### Issue N: Brief title
+- **Location**: file path and line
+- **Problem**: description
+- **Fix**: suggested correction
+
+## Unified Diff
+```diff
+(unified diff here)
+```""",
+    "html": """
+OUTPUT FORMAT:
+Provide your review in HTML format with:
+- <h2> for sections (Summary, Commit Message, Changes, Diff)
+- <pre><code> for commit message and diff
+- <ul>/<li> for individual issues
+- Do NOT include <html>, <head>, or <body> tags - just the content
+
+Include sections for: Summary, Commit Message, Changes, Unified Diff""",
+    "json": """
+OUTPUT FORMAT:
+Provide your review as JSON with this structure:
+{
+  "summary": "Brief description of changes",
+  "commit_message": "Complete commit message ready to use",
+  "changes": [
+    {
+      "type": "spelling|grammar|technical|clarity|style",
+      "location": "line number or section",
+      "original": "original text",
+      "suggested": "corrected text",
+      "reason": "why this change"
+    }
+  ],
+  "diff": "unified diff as a string",
+  "stats": {
+    "total_issues": 0,
+    "spelling": 0,
+    "grammar": 0,
+    "technical": 0,
+    "clarity": 0
+  }
+}
+Output ONLY valid JSON, no markdown code fences or other text.""",
+}
+
+USER_PROMPT = """\
+Review the following DPDK documentation file and provide improvements.
+
+File path: {doc_file}
+Commit message prefix to use: {commit_prefix}
+
+{format_instruction}
+
+---DOCUMENT CONTENT---
+"""
+
+
+def error(msg: str) -> NoReturn:
+    """Print error message and exit."""
+    print(f"Error: {msg}", file=sys.stderr)
+    sys.exit(1)
+
+
+def get_git_config(key: str) -> str | None:
+    """Get a value from git config."""
+    try:
+        result = subprocess.run(
+            ["git", "config", "--get", key],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_smtp_config() -> dict[str, Any]:
+    """Get SMTP configuration from git config sendemail settings."""
+    config = {
+        "server": get_git_config("sendemail.smtpserver"),
+        "port": get_git_config("sendemail.smtpserverport"),
+        "user": get_git_config("sendemail.smtpuser"),
+        "encryption": get_git_config("sendemail.smtpencryption"),
+        "password": get_git_config("sendemail.smtppass"),
+    }
+
+    # Set defaults
+    if not config["port"]:
+        if config["encryption"] == "ssl":
+            config["port"] = "465"
+        else:
+            config["port"] = "587"
+
+    # Convert port to int
+    try:
+        config["port"] = int(config["port"])
+    except (ValueError, TypeError):
+        error(f"Invalid SMTP port in git config: {config['port']}")
+
+    return config
+
+
+def get_commit_prefix(filepath: str) -> str:
+    """Determine commit message prefix from file path."""
+    for prefix_path, prefix in COMMIT_PREFIX_MAP:
+        if filepath.startswith(prefix_path):
+            return prefix
+    return "doc:"
+
+
+def build_user_prompt(
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> str:
+    """Build the user prompt with format instructions."""
+    format_instruction = FORMAT_INSTRUCTIONS.get(output_format, "")
+    if include_diff_markers and output_format not in ("text", "json"):
+        format_instruction += DIFF_MARKERS_INSTRUCTION
+    return USER_PROMPT.format(
+        doc_file=doc_file,
+        commit_prefix=commit_prefix,
+        format_instruction=format_instruction,
+    )
+
+
+def build_anthropic_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Anthropic API."""
+    user_prompt = build_user_prompt(
+        doc_file, commit_prefix, output_format, include_diff_markers
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "system": [
+            {"type": "text", "text": SYSTEM_PROMPT},
+            {
+                "type": "text",
+                "text": agents_content,
+                "cache_control": {"type": "ephemeral"},
+            },
+        ],
+        "messages": [
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            }
+        ],
+    }
+
+
+def build_openai_request(
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for OpenAI-compatible APIs."""
+    user_prompt = build_user_prompt(
+        doc_file, commit_prefix, output_format, include_diff_markers
+    )
+    return {
+        "model": model,
+        "max_tokens": max_tokens,
+        "messages": [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": agents_content},
+            {
+                "role": "user",
+                "content": user_prompt + doc_content,
+            },
+        ],
+    }
+
+
+def build_google_request(
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+) -> dict[str, Any]:
+    """Build request payload for Google Gemini API."""
+    user_prompt = build_user_prompt(
+        doc_file, commit_prefix, output_format, include_diff_markers
+    )
+    return {
+        "systemInstruction": {
+            "parts": [
+                {"text": SYSTEM_PROMPT},
+                {"text": agents_content},
+            ],
+        },
+        "contents": [
+            {
+                "role": "user",
+                "parts": [{"text": user_prompt + doc_content}],
+            },
+        ],
+        "generationConfig": {"maxOutputTokens": max_tokens},
+    }
+
+
+def call_api(
+    provider: str,
+    api_key: str,
+    model: str,
+    max_tokens: int,
+    agents_content: str,
+    doc_content: str,
+    doc_file: str,
+    commit_prefix: str,
+    output_format: str = "text",
+    include_diff_markers: bool = False,
+    verbose: bool = False,
+    timeout: int = 120,
+) -> tuple[str, TokenUsage]:
+    """Make API request to the specified provider.
+
+    Returns a tuple of (response_text, token_usage).
+    """
+    config = PROVIDERS[provider]
+
+    # Build request based on provider
+    if provider == "anthropic":
+        request_data = build_anthropic_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-api-key": api_key,
+            "anthropic-version": "2023-06-01",
+        }
+        url = config["endpoint"]
+    elif provider == "google":
+        request_data = build_google_request(
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "x-goog-api-key": api_key,
+        }
+        url = f"{config['endpoint']}/{model}:generateContent"
+    else:  # openai, xai
+        request_data = build_openai_request(
+            model,
+            max_tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            output_format,
+            include_diff_markers,
+        )
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {api_key}",
+        }
+        url = config["endpoint"]
+
+    # Make request with retries for transient errors
+    request_body = json.dumps(request_data).encode("utf-8")
+    max_retries = 3
+    result = None
+
+    for attempt in range(max_retries + 1):
+        req = Request(url, data=request_body, headers=headers, method="POST")
+        try:
+            with urlopen(req, timeout=timeout) as response:
+                result = json.loads(response.read().decode("utf-8"))
+            break  # Success
+        except HTTPError as e:
+            if e.code in (429, 503, 529) and attempt < max_retries:
+                delay = 2 ** (attempt + 1)  # 2, 4, 8 seconds
+                # Check for Retry-After header
+                retry_after = e.headers.get("Retry-After")
+                if retry_after:
+                    try:
+                        delay = max(delay, int(retry_after))
+                    except ValueError:
+                        pass
+                print(
+                    f"API returned {e.code}, retrying in {delay}s "
+                    f"(attempt {attempt + 1}/{max_retries})...",
+                    file=sys.stderr,
+                )
+                e.read()  # Drain the response body
+                time.sleep(delay)
+                continue
+            error_body = e.read().decode("utf-8")
+            try:
+                error_data = json.loads(error_body)
+                error(f"API error: {error_data.get('error', error_body)}")
+            except json.JSONDecodeError:
+                error(f"API error ({e.code}): {error_body}")
+        except URLError as e:
+            if isinstance(e.reason, TimeoutError):
+                error(f"Request timed out after {timeout} seconds")
+            error(f"Connection error: {e.reason}")
+        except TimeoutError:
+            error(f"Request timed out after {timeout} seconds")
+
+    if result is None:
+        error("API request failed after all retries")
+
+    # Extract token usage
+    usage = TokenUsage(api_calls=1)
+    if provider == "anthropic":
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("input_tokens", 0)
+        usage.output_tokens = raw_usage.get("output_tokens", 0)
+        usage.cache_creation_tokens = raw_usage.get(
+            "cache_creation_input_tokens", 0
+        )
+        usage.cache_read_tokens = raw_usage.get("cache_read_input_tokens", 0)
+    elif provider == "google":
+        raw_usage = result.get("usageMetadata", {})
+        usage.input_tokens = raw_usage.get("promptTokenCount", 0)
+        usage.output_tokens = raw_usage.get("candidatesTokenCount", 0)
+    else:  # openai, xai
+        raw_usage = result.get("usage", {})
+        usage.input_tokens = raw_usage.get("prompt_tokens", 0)
+        usage.output_tokens = raw_usage.get("completion_tokens", 0)
+        # OpenAI cache details (if available)
+        cache_details = raw_usage.get("prompt_tokens_details", {})
+        if cache_details:
+            usage.cache_read_tokens = cache_details.get("cached_tokens", 0)
+
+    # Show per-call details in verbose mode
+    if verbose:
+        print("=== Token Usage ===", file=sys.stderr)
+        print(f"Input tokens: {usage.input_tokens:,}", file=sys.stderr)
+        print(f"Output tokens: {usage.output_tokens:,}", file=sys.stderr)
+        if usage.cache_creation_tokens:
+            print(
+                f"Cache creation: {usage.cache_creation_tokens:,}",
+                file=sys.stderr,
+            )
+        if usage.cache_read_tokens:
+            print(
+                f"Cache read: {usage.cache_read_tokens:,}",
+                file=sys.stderr,
+            )
+        print("===================", file=sys.stderr)
+
+    # Extract response text
+    if provider == "anthropic":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        content = result.get("content", [])
+        text = "".join(
+            block.get("text", "") for block in content if block.get("type") == "text"
+        )
+        return text, usage
+    elif provider == "google":
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        candidates = result.get("candidates", [])
+        if not candidates:
+            error("No response from Gemini")
+        parts = candidates[0].get("content", {}).get("parts", [])
+        text = "".join(part.get("text", "") for part in parts)
+        return text, usage
+    else:  # openai, xai
+        if "error" in result:
+            error(f"API error: {result['error'].get('message', result)}")
+        choices = result.get("choices", [])
+        if not choices:
+            error("No response from API")
+        text = choices[0].get("message", {}).get("content", "")
+        return text, usage
+
+
+def parse_review_text(review_text: str) -> tuple[str, str]:
+    """Extract commit message and diff from text format response."""
+    commit_msg = ""
+    diff = ""
+
+    # Extract commit message
+    msg_match = re.search(
+        r"---COMMIT_MESSAGE_START---\s*\n(.*?)\n---COMMIT_MESSAGE_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if msg_match:
+        commit_msg = msg_match.group(1).strip()
+
+    # Extract unified diff
+    diff_match = re.search(
+        r"---UNIFIED_DIFF_START---\s*\n(.*?)\n---UNIFIED_DIFF_END---",
+        review_text,
+        re.DOTALL,
+    )
+    if diff_match:
+        diff = diff_match.group(1).strip()
+        # Clean up any markdown code fence if present
+        diff = re.sub(r"^```diff\s*\n?", "", diff)
+        diff = re.sub(r"\n?```\s*$", "", diff)
+
+    return commit_msg, diff
+
+
+def strip_diff_markers(text: str) -> str:
+    """Remove the diff/msg extraction markers from text."""
+    # Remove commit message markers and content
+    text = re.sub(
+        r"\n*---COMMIT_MESSAGE_START---\s*\n.*?\n---COMMIT_MESSAGE_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    # Remove unified diff markers and content
+    text = re.sub(
+        r"\n*---UNIFIED_DIFF_START---\s*\n.*?\n---UNIFIED_DIFF_END---\s*",
+        "",
+        text,
+        flags=re.DOTALL,
+    )
+    return text.strip()
+
+
+def send_email(
+    to_addrs: list[str],
+    cc_addrs: list[str],
+    from_addr: str,
+    subject: str,
+    in_reply_to: str | None,
+    body: str,
+    dry_run: bool = False,
+    verbose: bool = False,
+) -> None:
+    """Send review email via SMTP using git sendemail config."""
+    # Build email message
+    msg = EmailMessage()
+    msg["From"] = from_addr
+    msg["To"] = ", ".join(to_addrs)
+    if cc_addrs:
+        msg["Cc"] = ", ".join(cc_addrs)
+    msg["Subject"] = subject
+    if in_reply_to:
+        msg["In-Reply-To"] = in_reply_to
+        msg["References"] = in_reply_to
+    msg.set_content(body)
+
+    if dry_run:
+        print("=== Email Preview (dry-run) ===", file=sys.stderr)
+        print(msg.as_string(), file=sys.stderr)
+        print("=== End Preview ===", file=sys.stderr)
+        return
+
+    # Get SMTP configuration from git config
+    smtp_config = get_smtp_config()
+
+    if not smtp_config["server"]:
+        error("No SMTP server configured. Set git config sendemail.smtpserver")
+
+    server = smtp_config["server"]
+    port = smtp_config["port"]
+    user = smtp_config["user"]
+    encryption = smtp_config["encryption"]
+
+    # Get password from environment or git config, or prompt
+    password = os.environ.get("SMTP_PASSWORD") or smtp_config["password"]
+    if user and not password:
+        password = getpass.getpass(f"SMTP password for {user}@{server}: ")
+
+    if verbose:
+        print(f"SMTP server: {server}:{port}", file=sys.stderr)
+        print(f"SMTP user: {user or '(none)'}", file=sys.stderr)
+        print(f"Encryption: {encryption or 'starttls'}", file=sys.stderr)
+
+    # Collect all recipients
+    all_recipients = list(to_addrs)
+    if cc_addrs:
+        all_recipients.extend(cc_addrs)
+
+    try:
+        if encryption == "ssl":
+            # SSL/TLS connection from the start (port 465)
+            context = ssl.create_default_context()
+            with smtplib.SMTP_SSL(server, port, context=context) as smtp:
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+        else:
+            # STARTTLS (port 587) or plain (port 25)
+            with smtplib.SMTP(server, port) as smtp:
+                smtp.ehlo()
+                if encryption == "tls" or port == 587:
+                    context = ssl.create_default_context()
+                    smtp.starttls(context=context)
+                    smtp.ehlo()
+                if user and password:
+                    smtp.login(user, password)
+                smtp.send_message(msg, from_addr, all_recipients)
+
+        print(f"Email sent via SMTP ({server}:{port})", file=sys.stderr)
+
+    except smtplib.SMTPAuthenticationError as e:
+        error(f"SMTP authentication failed: {e}")
+    except smtplib.SMTPException as e:
+        error(f"SMTP error: {e}")
+    except OSError as e:
+        error(f"Connection error to {server}:{port}: {e}")
+
+
+def list_providers() -> None:
+    """Print available providers and exit."""
+    print("Available AI Providers:\n")
+    print(f"{'Provider':<12} {'Default Model':<30} {'API Key Variable'}")
+    print(f"{'--------':<12} {'-------------':<30} {'----------------'}")
+    for name, config in PROVIDERS.items():
+        print(f"{name:<12} {config['default_model']:<30} {config['env_var']}")
+    sys.exit(0)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Review DPDK documentation files using AI providers. "
+        "Accepts multiple files and generates output for each.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    %(prog)s doc/guides/prog_guide/mempool_lib.rst
+    %(prog)s doc/guides/nics/*.rst              # Review all NIC docs
+    %(prog)s -p openai -o /tmp doc/guides/nics/ixgbe.rst doc/guides/nics/i40e.rst
+    %(prog)s -f html -d -o /tmp/reviews doc/guides/nics/*.rst  # HTML + diff files
+    %(prog)s -f json -o /tmp doc/guides/howto/flow_bifurcation.rst
+    %(prog)s --send-email --to dev@dpdk.org doc/guides/nics/ixgbe.rst
+
+Output files (in output-dir):
+    <basename>.txt|.md|.html|.json  Review in selected format
+    <basename>.diff                  Unified diff (text/json, or with --diff)
+    <basename>.msg                   Commit message (text/json, or with --diff)
+
+After review:
+    git apply <basename>.diff
+    git commit -sF <basename>.msg
+
+SMTP Configuration (from git config):
+    sendemail.smtpserver      SMTP server hostname
+    sendemail.smtpserverport  SMTP port (default: 587 for TLS, 465 for SSL)
+    sendemail.smtpuser        SMTP username
+    sendemail.smtpencryption  'tls' for STARTTLS, 'ssl' for SSL/TLS
+    sendemail.smtppass        SMTP password (or set SMTP_PASSWORD env var)
+
+Example git config:
+    git config --global sendemail.smtpserver smtp.gmail.com
+    git config --global sendemail.smtpserverport 587
+    git config --global sendemail.smtpuser yourname@gmail.com
+    git config --global sendemail.smtpencryption tls
+
+Token Usage:
+    Token counts are always printed to stderr after each run.
+    %(prog)s -c doc/guides/nics/ixgbe.rst    # Include estimated cost
+    %(prog)s -c -f json doc/guides/nics/*.rst # Cost in JSON metadata too
+        """,
+    )
+
+    parser.add_argument(
+        "doc_files",
+        nargs="*",
+        metavar="doc_file",
+        help="Documentation file(s) to review",
+    )
+    parser.add_argument(
+        "-p",
+        "--provider",
+        choices=PROVIDERS.keys(),
+        default="anthropic",
+        help="AI provider (default: anthropic)",
+    )
+    parser.add_argument(
+        "-a",
+        "--agents",
+        default="AGENTS.md",
+        help="Path to AGENTS.md file (default: AGENTS.md)",
+    )
+    parser.add_argument(
+        "-m",
+        "--model",
+        help="Model to use (default: provider-specific)",
+    )
+    parser.add_argument(
+        "-t",
+        "--tokens",
+        type=int,
+        default=8192,
+        help="Max tokens for response (default: 8192)",
+    )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        default=".",
+        help="Output directory for all output files (default: .)",
+    )
+    parser.add_argument(
+        "-v",
+        "--verbose",
+        action="store_true",
+        help="Show API request details",
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress review output to stdout (only write files)",
+    )
+    parser.add_argument(
+        "-f",
+        "--format",
+        choices=FORMAT_EXTENSIONS,
+        default="text",
+        dest="output_format",
+        help="Output format: text, markdown, html, json (default: text)",
+    )
+    parser.add_argument(
+        "-d",
+        "--diff",
+        action="store_true",
+        help="Always produce .diff and .msg files (automatic for text/json)",
+    )
+    parser.add_argument(
+        "-l",
+        "--list-providers",
+        action="store_true",
+        help="List available providers and exit",
+    )
+    parser.add_argument(
+        "-c",
+        "--show-costs",
+        action="store_true",
+        help="Show estimated cost alongside token usage summary",
+    )
+    parser.add_argument(
+        "--timeout",
+        type=int,
+        default=120,
+        metavar="SECONDS",
+        help="API request timeout in seconds (default: 120)",
+    )
+
+    # Email options
+    email_group = parser.add_argument_group("Email Options")
+    email_group.add_argument(
+        "--send-email",
+        action="store_true",
+        help="Send review via email",
+    )
+    email_group.add_argument(
+        "--to",
+        action="append",
+        dest="to_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="Email recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--cc",
+        action="append",
+        dest="cc_addrs",
+        default=[],
+        metavar="ADDRESS",
+        help="CC recipient (can be specified multiple times)",
+    )
+    email_group.add_argument(
+        "--from",
+        dest="from_addr",
+        metavar="ADDRESS",
+        help="From address (default: from git config)",
+    )
+    email_group.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Show email without sending",
+    )
+
+    args = parser.parse_args()
+
+    if args.list_providers:
+        list_providers()
+
+    # Check doc files provided (after list_providers so that flag works alone)
+    if not args.doc_files:
+        parser.error("at least one doc_file is required")
+
+    # Get provider config
+    config = PROVIDERS[args.provider]
+    model = args.model or config["default_model"]
+
+    # Get API key
+    api_key = os.environ.get(config["env_var"])
+    if not api_key:
+        error(f"{config['env_var']} environment variable not set")
+
+    # Validate files
+    agents_path = Path(args.agents)
+    if not agents_path.exists():
+        error(f"AGENTS.md not found: {args.agents}")
+
+    # Validate all doc files exist before processing
+    doc_paths = []
+    for doc_file in args.doc_files:
+        doc_path = Path(doc_file)
+        if not doc_path.exists():
+            error(f"Documentation file not found: {doc_file}")
+        doc_paths.append((doc_file, doc_path))
+
+    # Validate email options
+    if args.send_email and not args.to_addrs:
+        error("--send-email requires at least one --to address")
+
+    # Get from address for email
+    from_addr = args.from_addr
+    if args.send_email and not from_addr:
+        git_name = get_git_config("user.name")
+        git_email = get_git_config("user.email")
+        if git_email:
+            from_addr = f"{git_name} <{git_email}>" if git_name else git_email
+        else:
+            error("No --from specified and git user.email not configured")
+
+    # Read AGENTS.md once
+    agents_content = agents_path.read_text()
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    provider_name = config["name"]
+
+    # Detect duplicate stems to disambiguate output filenames
+    stem_counts: dict[str, int] = {}
+    for _, doc_path in doc_paths:
+        stem = doc_path.stem
+        stem_counts[stem] = stem_counts.get(stem, 0) + 1
+    duplicate_stems = {s for s, c in stem_counts.items() if c > 1}
+
+    # Accumulate token usage across all API calls
+    total_usage = TokenUsage()
+
+    # Process each file
+    num_files = len(doc_paths)
+    for file_idx, (doc_file, doc_path) in enumerate(doc_paths, 1):
+        if num_files > 1:
+            print(
+                f"\n{'=' * 60}",
+                file=sys.stderr,
+            )
+            print(
+                f"Processing file {file_idx}/{num_files}: {doc_file}",
+                file=sys.stderr,
+            )
+            print(
+                f"{'=' * 60}",
+                file=sys.stderr,
+            )
+
+        # Determine output filenames (disambiguate if stems collide)
+        doc_basename = doc_path.stem
+        if doc_basename in duplicate_stems:
+            # Prefix with parent directory name to avoid clobbering
+            parent = doc_path.parent.name or "root"
+            doc_basename = f"{parent}-{doc_basename}"
+        diff_file = output_dir / f"{doc_basename}.diff"
+        msg_file = output_dir / f"{doc_basename}.msg"
+
+        # Get commit prefix
+        commit_prefix = get_commit_prefix(doc_file)
+
+        # Read doc content
+        doc_content = doc_path.read_text()
+
+        if args.verbose:
+            print("=== Request ===", file=sys.stderr)
+            print(f"Provider: {args.provider}", file=sys.stderr)
+            print(f"Model: {model}", file=sys.stderr)
+            print(f"Output format: {args.output_format}", file=sys.stderr)
+            print(f"AGENTS file: {args.agents}", file=sys.stderr)
+            print(f"Doc file: {doc_file}", file=sys.stderr)
+            print(f"Commit prefix: {commit_prefix}", file=sys.stderr)
+            print(f"Output dir: {args.output_dir}", file=sys.stderr)
+            if args.send_email:
+                print("Send email: yes", file=sys.stderr)
+                print(f"To: {', '.join(args.to_addrs)}", file=sys.stderr)
+                if args.cc_addrs:
+                    print(f"Cc: {', '.join(args.cc_addrs)}", file=sys.stderr)
+                print(f"From: {from_addr}", file=sys.stderr)
+            print("===============", file=sys.stderr)
+
+        # Call API
+        review_text, call_usage = call_api(
+            args.provider,
+            api_key,
+            model,
+            args.tokens,
+            agents_content,
+            doc_content,
+            doc_file,
+            commit_prefix,
+            args.output_format,
+            args.diff,
+            args.verbose,
+            args.timeout,
+        )
+        total_usage.add(call_usage)
+
+        if not review_text:
+            print(
+                f"Warning: No response received for {doc_file}",
+                file=sys.stderr,
+            )
+            continue
+
+        # Determine review output file
+        format_ext = FORMAT_EXTENSIONS[args.output_format]
+        review_file = output_dir / f"{doc_basename}{format_ext}"
+
+        # Determine if we should write diff/msg files
+        write_diff_msg = args.diff or args.output_format in ("text", "json")
+
+        # Extract commit message and diff first (before stripping markers)
+        commit_msg, diff = "", ""
+        if write_diff_msg:
+            if args.output_format == "json":
+                # Will extract from JSON below
+                pass
+            else:
+                # Parse from text format markers
+                commit_msg, diff = parse_review_text(review_text)
+
+        # For non-text formats with --diff, strip the markers from display output
+        display_text = review_text
+        if args.diff and args.output_format in ("markdown", "html"):
+            display_text = strip_diff_markers(review_text)
+
+        # Build formatted output text
+        if args.output_format == "text":
+            output_text = review_text
+        elif args.output_format == "json":
+            # Try to parse JSON response
+            try:
+                review_data = json.loads(review_text)
+            except json.JSONDecodeError:
+                print("Warning: Response is not valid JSON", file=sys.stderr)
+                review_data = {"raw_response": review_text}
+
+            # Extract diff/msg from JSON if present
+            if write_diff_msg:
+                if isinstance(review_data, dict) and "raw_response" not in review_data:
+                    commit_msg = review_data.get("commit_message", "")
+                    diff = review_data.get("diff", "")
+
+            # Add metadata
+            usage_data = {
+                "api_calls": call_usage.api_calls,
+                "input_tokens": call_usage.input_tokens,
+                "output_tokens": call_usage.output_tokens,
+                "total_tokens": call_usage.input_tokens + call_usage.output_tokens,
+            }
+            if call_usage.cache_creation_tokens:
+                usage_data["cache_creation_tokens"] = call_usage.cache_creation_tokens
+            if call_usage.cache_read_tokens:
+                usage_data["cache_read_tokens"] = call_usage.cache_read_tokens
+            if args.show_costs:
+                usage_data["estimated_cost_usd"] = round(
+                    estimate_cost(call_usage, args.provider, model), 6
+                )
+
+            output_data = {
+                "metadata": {
+                    "doc_file": doc_file,
+                    "provider": args.provider,
+                    "provider_name": provider_name,
+                    "model": model,
+                    "commit_prefix": commit_prefix,
+                    "token_usage": usage_data,
+                },
+                "review": review_data,
+            }
+            output_text = json.dumps(output_data, indent=2)
+        elif args.output_format == "markdown":
+            output_text = f"""# Documentation Review: {doc_path.name}
+
+*Reviewed by {provider_name} ({model})*
+
+{display_text}
+"""
+        elif args.output_format == "html":
+            output_text = f"""<!DOCTYPE html>
+<html>
+<head>
+<meta charset="utf-8">
+<title>Review: {doc_path.name}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; max-width: 900px; margin: 2em auto; padding: 0 1em; }}
+h1 {{ color: #333; }}
+.review-meta {{ color: #666; font-style: italic; }}
+pre {{ background: #f5f5f5; padding: 1em; overflow-x: auto; }}
+</style>
+</head>
+<body>
+<h1>Documentation Review: {doc_path.name}</h1>
+<p class="review-meta">Reviewed by {provider_name} ({model})</p>
+<div class="review-content">
+{display_text}
+</div>
+</body>
+</html>
+"""
+
+        # Write formatted review to file
+        review_file.write_text(output_text)
+        print(f"Review written to: {review_file}", file=sys.stderr)
+
+        # Write diff/msg files
+        if write_diff_msg:
+            if commit_msg:
+                msg_file.write_text(commit_msg + "\n")
+                print(f"Commit message written to: {msg_file}", file=sys.stderr)
+            else:
+                msg_file.write_text("# No commit message generated\n")
+                print("Warning: Could not extract commit message", file=sys.stderr)
+
+            if diff:
+                diff_file.write_text(diff + "\n")
+                print(f"Diff written to: {diff_file}", file=sys.stderr)
+            else:
+                diff_file.write_text("# No changes suggested\n")
+                print("Warning: Could not extract diff", file=sys.stderr)
+
+        # Print to stdout unless quiet (or multiple files without verbose)
+        show_stdout = not args.quiet and (num_files == 1 or args.verbose)
+        if show_stdout:
+            print(
+                f"\n=== Documentation Review: {doc_path.name} "
+                f"(via {provider_name}) ==="
+            )
+            print(output_text)
+
+            # Print usage instructions for text format
+            if args.output_format == "text":
+                print("\n=== Output Files ===")
+                print(f"Commit message: {msg_file}")
+                print(f"Diff file:      {diff_file}")
+                print("\nTo apply changes:")
+                print(f"  git apply {diff_file}")
+                print(f"  git commit -sF {msg_file}")
+
+        # Send email if requested
+        if args.send_email:
+            if args.output_format != "text":
+                print(
+                    "Note: Email will be sent as plain text regardless of "
+                    f"--format={args.output_format}",
+                    file=sys.stderr,
+                )
+
+            review_subject = f"[REVIEW] {commit_prefix} {doc_path.name}"
+
+            # Build email body
+            email_body = f"""AI-generated documentation review of {doc_file}
+Reviewed using {provider_name} ({model})
+
+This is an automated review. Please verify all suggestions.
+
+---
+
+{review_text}
+"""
+
+            if args.verbose:
+                print("", file=sys.stderr)
+                print("=== Email Details ===", file=sys.stderr)
+                print(f"Subject: {review_subject}", file=sys.stderr)
+                print("=====================", file=sys.stderr)
+
+            send_email(
+                args.to_addrs,
+                args.cc_addrs,
+                from_addr,
+                review_subject,
+                None,
+                email_body,
+                args.dry_run,
+                args.verbose,
+            )
+
+            if not args.dry_run:
+                print("", file=sys.stderr)
+                print(f"Review sent to: {', '.join(args.to_addrs)}", file=sys.stderr)
+
+    # Print summary for multiple files
+    if num_files > 1:
+        print(f"\n{'=' * 60}", file=sys.stderr)
+        print(f"Processed {num_files} files", file=sys.stderr)
+        print(f"Output directory: {output_dir}", file=sys.stderr)
+
+    # Print token usage summary
+    if total_usage.api_calls > 0:
+        print("", file=sys.stderr)
+        print(
+            format_token_summary(
+                total_usage, args.provider, model, args.show_costs
+            ),
+            file=sys.stderr,
+        )
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread
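[Editorial note: the send path in review-doc.py above selects between implicit SSL, STARTTLS, and plain SMTP based on the configured encryption and port. That selection rule can be isolated as a small pure function; this is a sketch, and the `smtp_mode` name plus the SSL condition (`encryption == "ssl"` or port 465) are inferred from the visible `else` branch and the port-number comments, not taken verbatim from the patch.]

```python
def smtp_mode(encryption, port):
    # Mirrors the branch structure in review-doc.py's send path:
    # implicit SSL on port 465 (or explicit 'ssl'), STARTTLS on
    # port 587 (or explicit 'tls'), plain SMTP otherwise.
    if encryption == "ssl" or port == 465:
        return "ssl"
    if encryption == "tls" or port == 587:
        return "starttls"
    return "plain"
```

Checking the mode before connecting makes the git-config defaults (`sendemail.smtpencryption`, `sendemail.smtpserverport`) easy to unit-test without a live SMTP server.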

* [PATCH v13 5/6] doc: add AI-assisted patch review to contributing guide
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (3 preceding siblings ...)
  2026-04-02 19:44     ` [PATCH v13 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
@ 2026-04-02 19:44     ` Stephen Hemminger
  2026-04-02 19:44     ` [PATCH v13 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add a new section to the contributing guide describing the
analyze-patch.py script which uses AI providers to review patches
against DPDK coding standards before submission to the mailing list.

The new section covers basic usage, provider selection, patch series
handling, LTS release review, and output format options. A note
clarifies that AI review supplements but does not replace human
review.

Also add a reference to the script in the new driver guide's
test tools checklist.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/contributing/new_driver.rst |  2 +
 doc/guides/contributing/patches.rst    | 59 ++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/guides/contributing/new_driver.rst b/doc/guides/contributing/new_driver.rst
index 555e875329..6c0d356cfd 100644
--- a/doc/guides/contributing/new_driver.rst
+++ b/doc/guides/contributing/new_driver.rst
@@ -210,3 +210,5 @@ Be sure to run the following test tools per patch in a patch series:
 * `check-doc-vs-code.sh`
 * `check-spdx-tag.sh`
 * Build documentation and validate how output looks
+* Optionally run ``analyze-patch.py`` for AI-assisted review
+  (see :ref:`ai_assisted_review` in the Contributing Guide)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 5f554d47e6..1e50799c19 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -183,6 +183,10 @@ Make your planned changes in the cloned ``dpdk`` repo. Here are some guidelines
 
 * Code and related documentation must be updated atomically in the same patch.
 
+* Consider running the :ref:`AI-assisted review <ai_assisted_review>` tool
+  before submitting to catch common issues early.
+  This is encouraged but not required.
+
 Once the changes have been made you should commit them to your local repo.
 
 For small changes, that do not require specific explanations, it is better to keep things together in the
@@ -503,6 +507,61 @@ Additionally, when contributing to the DTS tool, patches should also be checked
 the ``dts-check-format.sh`` script in the ``devtools`` directory of the DPDK repo.
 To run the script, extra :ref:`Python dependencies <dts_deps>` are needed.
 
+
+.. _ai_assisted_review:
+
+AI-Assisted Patch Review
+------------------------
+
+Contributors may optionally use the ``devtools/analyze-patch.py`` script
+to get an AI-assisted review of patches before submitting them to the mailing list.
+The script checks patches against the DPDK coding standards and contribution
+guidelines documented in ``AGENTS.md``.
+
+The script supports multiple AI providers (Anthropic Claude, OpenAI ChatGPT,
+xAI Grok, Google Gemini).  An API key for the chosen provider must be set
+in the corresponding environment variable (see ``--list-providers``).
+
+Basic usage::
+
+   # Review a single patch (default provider: Anthropic Claude)
+   devtools/analyze-patch.py my-patch.patch
+
+   # Use a different provider
+   devtools/analyze-patch.py -p openai my-patch.patch
+
+   # Review for an LTS branch (enables stricter rules)
+   devtools/analyze-patch.py -r 24.11 my-patch.patch
+
+   # List available providers and their API key variables
+   devtools/analyze-patch.py --list-providers
+
+For a patch series in an mbox file, the ``--split-patches`` option reviews
+each patch individually::
+
+   devtools/analyze-patch.py --split-patches series.mbox
+
+   # Review only a range of patches
+   devtools/analyze-patch.py --split-patches --patch-range 1-5 series.mbox
+
+When reviewing for a Long Term Stable (LTS) release, use the ``-r`` option
+with the target version.  Any DPDK release with minor version ``.11``
+(e.g., 23.11, 24.11) is automatically recognized as LTS,
+and the script will enforce stricter rules: bug fixes only, no new features or APIs.
+
+Output can be formatted as plain text (default), Markdown, HTML, or JSON::
+
+   devtools/analyze-patch.py -f markdown -o review.md my-patch.patch
+
+The review guidelines in ``AGENTS.md`` focus on correctness bug detection
+and other DPDK-specific requirements. Commit message formatting and
+SPDX/copyright compliance are checked by ``checkpatches.sh`` and are
+not duplicated in the AI review.
+
+.. note::
+
+   Always verify AI suggestions before acting on them.
+
 .. _contrib_check_compilation:
 
 Checking Compilation
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v13 6/6] MAINTAINERS: add section for AI review tools
  2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
                       ` (4 preceding siblings ...)
  2026-04-02 19:44     ` [PATCH v13 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
@ 2026-04-02 19:44     ` Stephen Hemminger
  5 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2026-04-02 19:44 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Thomas Monjalon

Add maintainer entries for the AI-assisted code review tooling:
AGENTS.md, analyze-patch.py, compare-reviews.sh, and
review-doc.py.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 MAINTAINERS                            | 8 ++++++++
 doc/guides/rel_notes/release_26_07.rst | 5 +++++
 2 files changed, 13 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0f5539f851..c052b6c203 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -109,6 +109,14 @@ F: license/
 F: .editorconfig
 F: .mailmap
 
+AI review tools
+M: Stephen Hemminger <stephen@networkplumber.org>
+M: Aaron Conole <aconole@redhat.com>
+F: AGENTS.md
+F: devtools/analyze-patch.py
+F: devtools/compare-reviews.sh
+F: devtools/review-doc.py
+
 Linux kernel uAPI headers
 M: Maxime Coquelin <maxime.coquelin@redhat.com>
 F: devtools/linux-uapi.sh
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 060b26ff61..aa9e1e962c 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -55,6 +55,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added scripts and prompts for AI review.**
+
+  Added AGENTS.md file for AI review and supporting scripts to analyze
+  patches and review documentation.
+
 
 Removed Items
 -------------
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread
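[Editorial note: the main loop of review-doc.py in this series disambiguates output filenames when two input docs share a stem (e.g. two `ixgbe.rst` files in different directories) by prefixing the parent directory name. A standalone sketch of that logic, with the helper name `output_basenames` being a hypothetical label for illustration:]

```python
from pathlib import Path

def output_basenames(paths):
    # Count stems, as in review-doc.py's duplicate-stem detection.
    stems = {}
    for p in paths:
        stems[p.stem] = stems.get(p.stem, 0) + 1
    dup = {s for s, c in stems.items() if c > 1}
    # Prefix colliding stems with the parent directory name ("root"
    # if there is none) so later files cannot clobber earlier output.
    out = []
    for p in paths:
        base = p.stem
        if base in dup:
            base = f"{p.parent.name or 'root'}-{base}"
        out.append(base)
    return out
```

This keeps `<basename>.txt`, `.diff`, and `.msg` files distinct when a glob like `doc/guides/nics/*.rst doc/guides/howto/*.rst` pulls in same-named files.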

end of thread, other threads:[~2026-04-02 19:47 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <0260109014106.398156-1-stephen@networkplumber.org>
2026-01-26 18:40 ` [PATCH v7 0/4] devtools: add AI-assisted code review tools Stephen Hemminger
2026-01-26 18:40   ` [PATCH v7 1/4] doc: add AGENTS.md for AI-powered " Stephen Hemminger
2026-01-30 23:49     ` Stephen Hemminger
2026-01-26 18:40   ` [PATCH v7 2/4] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-01-26 18:40   ` [PATCH v7 3/4] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-01-26 18:40   ` [PATCH v7 4/4] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-02-09 19:48   ` [PATCH v8 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
2026-02-09 19:48     ` [PATCH v8 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
2026-02-09 19:48     ` [PATCH v8 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-02-09 19:48     ` [PATCH v8 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-02-09 19:48     ` [PATCH v8 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-02-09 19:48     ` [PATCH v8 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
2026-02-09 19:48     ` [PATCH v8 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
2026-03-04 17:59   ` [PATCH v9 0/6] add AGENTS.md and scripts for AI code review Stephen Hemminger
2026-03-04 17:59     ` [PATCH v9 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
2026-03-04 17:59     ` [PATCH v9 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-03-04 17:59     ` [PATCH v9 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-03-04 17:59     ` [PATCH v9 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-03-04 17:59     ` [PATCH v9 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
2026-03-04 17:59     ` [PATCH v9 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
2026-03-10  1:57   ` [PATCH v10 0/6] Add AGENTS and scripts for AI code review Stephen Hemminger
2026-03-10  1:57     ` [PATCH v10 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
2026-03-10  1:57     ` [PATCH v10 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-03-10  1:57     ` [PATCH v10 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-03-10  1:57     ` [PATCH v10 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-03-10  1:57     ` [PATCH v10 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
2026-03-10  1:57     ` [PATCH v10 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
2026-03-27 15:41   ` [PATCH v11 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
2026-03-27 15:41     ` [PATCH v11 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
2026-03-27 15:41     ` [PATCH v11 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-03-27 15:41     ` [PATCH v11 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-03-27 15:41     ` [PATCH v11 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-03-27 15:41     ` [PATCH v11 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
2026-03-27 15:41     ` [PATCH v11 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
2026-04-01 15:38   ` [PATCH v12 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
2026-04-01 15:38     ` [PATCH v12 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
2026-04-01 15:38     ` [PATCH v12 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-04-02  4:00       ` sunyuechi
2026-04-01 15:38     ` [PATCH v12 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-04-01 15:38     ` [PATCH v12 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-04-02  4:05       ` sunyuechi
2026-04-01 15:38     ` [PATCH v12 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
2026-04-01 15:38     ` [PATCH v12 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
2026-04-02 19:44   ` [PATCH v13 0/6] Add AGENTS.md and scripts for AI code review Stephen Hemminger
2026-04-02 19:44     ` [PATCH v13 1/6] doc: add AGENTS.md for AI code review tools Stephen Hemminger
2026-04-02 19:44     ` [PATCH v13 2/6] devtools: add multi-provider AI patch review script Stephen Hemminger
2026-04-02 19:44     ` [PATCH v13 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger
2026-04-02 19:44     ` [PATCH v13 4/6] devtools: add multi-provider AI documentation review script Stephen Hemminger
2026-04-02 19:44     ` [PATCH v13 5/6] doc: add AI-assisted patch review to contributing guide Stephen Hemminger
2026-04-02 19:44     ` [PATCH v13 6/6] MAINTAINERS: add section for AI review tools Stephen Hemminger
2026-02-13 21:39 [PATCH v9 0/6] devtools: AI-assisted code and documentation review Stephen Hemminger
2026-02-19 17:48 ` [PATCH v10 0/6] " Stephen Hemminger
2026-02-19 17:48   ` [PATCH v10 3/6] devtools: add compare-reviews.sh for multi-provider analysis Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox