From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>,
Alex Markuze <amarkuze@redhat.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
"rppt@kernel.org" <rppt@kernel.org>,
"lorenzo.stoakes@oracle.com" <lorenzo.stoakes@oracle.com>,
Xiubo Li <xiubli@redhat.com>,
"idryomov@gmail.com" <idryomov@gmail.com>,
"david@redhat.com" <david@redhat.com>,
"mgorman@suse.de" <mgorman@suse.de>,
"vbabka@suse.cz" <vbabka@suse.cz>,
"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
Ingo Molnar <mingo@redhat.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"surenb@google.com" <surenb@google.com>,
Valentin Schneider <vschneid@redhat.com>,
"kees@kernel.org" <kees@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"mhocko@suse.com" <mhocko@suse.com>,
"bsegall@google.com" <bsegall@google.com>,
"juri.lelli@redhat.com" <juri.lelli@redhat.com>
Subject: Re: [RFC PATCH 2/5] lib: Introduce BLOG (Binary LOGging) subsystem
Date: Thu, 30 Oct 2025 18:47:44 +0000 [thread overview]
Message-ID: <9dc0107937fe161e14b1353722060fde02637d24.camel@ibm.com> (raw)
In-Reply-To: <20251024084259.2359693-3-amarkuze@redhat.com>
On Fri, 2025-10-24 at 08:42 +0000, Alex Markuze wrote:
> This commit introduces the BLOG (Binary LOGging) library, a high-throughput
> structured logging framework designed for efficient binary log capture in
> kernel space with minimal performance impact on hot paths.
>
> BLOG provides a complete logging solution with the following key components:
>
> **Core Infrastructure (blog_core.c, blog.h):**
> - Logger registration and lifecycle management
> - Per-module logger instances with isolated namespaces
> - Source ID tracking for format string registration
> - Reference-counted logger cleanup
>
> **Memory Management (blog_batch.c, blog_pagefrag.c):**
> - Magazine-based allocator for TLS context pooling
> - Lock-free context acquisition/release on hot paths
> - Pagefrag-based circular buffer management
> - Efficient buffer recycling and capacity tracking
>
> **Serialization Framework (blog_ser.h):**
> - Compile-time type-safe serialization macros
> - Efficient binary encoding with minimal overhead
> - Support for all primitive types, strings, and binary blobs
> - Automatic size calculation and alignment handling
>
> **Deserialization Support (blog_des.c, blog_des.h):**
> - Iterator-based deserialization for userspace consumption
> - Format string resolution and parameter extraction
> - Entry validation and boundary checking
> - Designed for debugfs export and analysis tools
>
> **Module Integration (blog_module.c, blog_module.h):**
> - Per-task TLS context management
> - Module-specific logger registration
> - Context list tracking for debugfs enumeration
> - Automatic context initialization and cleanup
>
> **Configuration (lib/blog/Kconfig):**
> - CONFIG_BLOG: Master switch for BLOG subsystem
> - Default buffer size and capacity tuning options
> - Integration with kernel build system
>
> The implementation includes all correctness fixes for the serialization path:
> - Proper handling of parameter-less log messages
> - Correct reserve/publish size rounding to prevent buffer overlap
> - Accurate entry length tracking for deserialization
> - Proper locking for context snapshots to prevent deadlocks
>
> After this commit, BLOG builds cleanly when CONFIG_BLOG is enabled but
> remains unused by the kernel. No logging is performed until subsystems
> explicitly register loggers and wire in the TLS context hooks. Subsequent
> commits will integrate BLOG into the Ceph filesystem as a reference
> implementation.
>
> Signed-off-by: Alex Markuze <amarkuze@redhat.com>
> ---
> include/linux/blog/blog.h | 515 +++++++++++++++++++
> include/linux/blog/blog_batch.h | 54 ++
> include/linux/blog/blog_des.h | 46 ++
> include/linux/blog/blog_module.h | 329 ++++++++++++
> include/linux/blog/blog_pagefrag.h | 33 ++
> include/linux/blog/blog_ser.h | 275 ++++++++++
> lib/Kconfig | 2 +
> lib/Makefile | 2 +
> lib/blog/Kconfig | 56 +++
> lib/blog/Makefile | 15 +
> lib/blog/blog_batch.c | 311 ++++++++++++
> lib/blog/blog_core.c | 772 ++++++++++++++++++++++++++++
> lib/blog/blog_des.c | 385 ++++++++++++++
> lib/blog/blog_module.c | 781 +++++++++++++++++++++++++++++
> lib/blog/blog_pagefrag.c | 124 +++++
> 15 files changed, 3700 insertions(+)
> create mode 100644 include/linux/blog/blog.h
> create mode 100644 include/linux/blog/blog_batch.h
> create mode 100644 include/linux/blog/blog_des.h
> create mode 100644 include/linux/blog/blog_module.h
> create mode 100644 include/linux/blog/blog_pagefrag.h
> create mode 100644 include/linux/blog/blog_ser.h
> create mode 100644 lib/blog/Kconfig
> create mode 100644 lib/blog/Makefile
> create mode 100644 lib/blog/blog_batch.c
> create mode 100644 lib/blog/blog_core.c
> create mode 100644 lib/blog/blog_des.c
> create mode 100644 lib/blog/blog_module.c
> create mode 100644 lib/blog/blog_pagefrag.c
>
> diff --git a/include/linux/blog/blog.h b/include/linux/blog/blog.h
> new file mode 100644
> index 000000000000..6d93ff31732e
> --- /dev/null
> +++ b/include/linux/blog/blog.h
> @@ -0,0 +1,515 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Binary Logging Infrastructure (BLOG)
> + *
> + * Generic binary logging infrastructure for kernel subsystems.
> + * Modules maintain their own client mappings and debugfs interfaces.
> + */
Do we need to add a copyright statement to all of the newly created files?
> +#ifndef _LINUX_BLOG_H
> +#define _LINUX_BLOG_H
> +
> +#include <linux/types.h>
> +#include <linux/sched.h>
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +#include <linux/blog/blog_batch.h>
> +#include <linux/blog/blog_pagefrag.h>
> +#include <linux/blog/blog_ser.h>
> +#include <linux/blog/blog_des.h>
> +
> +struct blog_module_context;
> +
> +/* Debug configuration */
> +#ifdef CONFIG_BLOG_DEBUG
> +#define BLOG_DEBUG_POISON 1
> +#else
> +#define BLOG_DEBUG_POISON 0
> +#endif
What's the point of defining BLOG_DEBUG_POISON? It looks like CONFIG_BLOG_DEBUG and
BLOG_DEBUG_POISON are always equivalent, so you could use CONFIG_BLOG_DEBUG directly
instead of BLOG_DEBUG_POISON.
> +
> +#ifdef CONFIG_BLOG_TRACK_USAGE
> +#define BLOG_TRACK_USAGE 1
> +#else
> +#define BLOG_TRACK_USAGE 0
> +#endif
> +
Ditto. You could use CONFIG_BLOG_TRACK_USAGE instead of declaring BLOG_TRACK_USAGE.
> +/* Debug poison values */
> +#if BLOG_DEBUG_POISON
> +#define BLOG_LOG_ENTRY_POISON 0xD1E7C0DE
> +#define BLOG_CTX_POISON 0xCAFEBABE
> +#endif
> +
> +/* No global logger - all logging must use per-module contexts */
> +
> +/* Maximum values */
> +#define BLOG_MAX_PAYLOAD 255
> +#ifdef CONFIG_BLOG_MAX_SOURCES
> +#define BLOG_MAX_SOURCE_IDS CONFIG_BLOG_MAX_SOURCES
> +#else
> +#define BLOG_MAX_SOURCE_IDS 4096
> +#endif
> +#ifdef CONFIG_BLOG_MAX_CLIENTS
> +#define BLOG_MAX_CLIENT_IDS CONFIG_BLOG_MAX_CLIENTS
> +#else
> +#define BLOG_MAX_CLIENT_IDS 256
> +#endif
Technically speaking, you could define default values (and valid ranges) for
CONFIG_BLOG_MAX_SOURCES and CONFIG_BLOG_MAX_CLIENTS in Kconfig itself. Then there
would be no need for this set of fallback definitions here.
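For example, if Kconfig always supplies a default for both symbols (my assumption
here), the header could simply read:

#define BLOG_MAX_SOURCE_IDS	CONFIG_BLOG_MAX_SOURCES
#define BLOG_MAX_CLIENT_IDS	CONFIG_BLOG_MAX_CLIENTS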
> +
> +/**
> + * struct blog_source_info - Source location metadata for log entries
> + * @file: Source file name (from __FILE__)
> + * @func: Function name (from __func__)
> + * @line: Line number (from __LINE__)
> + * @fmt: Printf-style format string for this log site
> + * @warn_count: Number of warnings issued from this site
> + * @napi_usage: Number of times logged from NAPI context (if BLOG_TRACK_USAGE)
What is NAPI in this context? It would be good to expand the acronym in the comment.
> + * @task_usage: Number of times logged from task context (if BLOG_TRACK_USAGE)
> + * @napi_bytes: Total bytes logged from NAPI (if BLOG_TRACK_USAGE)
> + * @task_bytes: Total bytes logged from task (if BLOG_TRACK_USAGE)
> + *
> + * Maps source IDs to their original source locations. One entry per unique
> + * file:func:line location. The format string is stored here for use during
> + * deserialization to reconstruct the original log message.
Does this mean that the kernel should be compiled with debug symbols for the BLOG
subsystem?
> + */
> +struct blog_source_info {
> + const char *file;
> + const char *func;
> + unsigned int line;
> + const char *fmt;
> + int warn_count;
Why is warn_count not under BLOG_TRACK_USAGE? And why is it not of atomic_t type?
> +#if BLOG_TRACK_USAGE
> + atomic_t napi_usage;
> + atomic_t task_usage;
> + atomic_t napi_bytes;
> + atomic_t task_bytes;
> +#endif
> +};
> +
> +/**
> + * struct blog_log_entry - Binary log entry header and payload
> + * @debug_poison: Magic value for corruption detection (if BLOG_DEBUG_POISON)
> + * @ts_delta: Timestamp delta from context's base_jiffies
> + * @source_id: Source location ID (index into source_map)
> + * @len: Length of serialized data in buffer (max 255)
> + * @client_id: Module-specific client identifier
> + * @flags: Reserved for future use
> + * @buffer: Variable-length serialized argument data
> + *
> + * Wire format for a single log entry. Entries are stored sequentially in
> + * the pagefrag buffer. The buffer contains binary-serialized arguments
> + * that match the format string stored in source_map[source_id].
> + */
> +struct blog_log_entry {
> +#if BLOG_DEBUG_POISON
> + u64 debug_poison;
> +#endif
> + u32 ts_delta;
> + u16 source_id;
> + u8 len;
Why is len not placed next to buffer?
> + u8 client_id;
> + u8 flags;
As far as I can see, this data structure is not naturally aligned. It would be
better to start buffer[] on an 8-byte aligned boundary; one possible layout is
sketched after the structure below.
> + char buffer[];
> +};
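Something along these lines, as a sketch only (the explicit __pad field is my
addition; it keeps buffer[] on an 8-byte boundary whether or not the poison field
is compiled in):

struct blog_log_entry {
#if BLOG_DEBUG_POISON
	u64 debug_poison;	/* 8 bytes */
#endif
	u32 ts_delta;		/* 4 bytes */
	u16 source_id;		/* 2 bytes */
	u8 len;
	u8 client_id;
	u8 flags;		/* header is 9 bytes so far (17 with poison) */
	u8 __pad[7];		/* pad header to 16 bytes (24 with poison) */
	char buffer[];		/* starts on an 8-byte boundary in both cases */
};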
> +
> +/**
> + * struct blog_tls_ctx - Per-task (or NAPI) logging context
> + * @list: Linkage in logger's contexts list
> + * @release: Cleanup function called on context destruction
> + * @refcount: Reference count (0=in batch, 1=active)
> + * @task: Associated task (NULL for NAPI contexts)
> + * @pid: Process ID of associated task
> + * @comm: Command name of associated task
> + * @id: Unique context ID (for debugging)
> + * @debug_poison: Magic value for corruption detection (if BLOG_DEBUG_POISON)
> + * @base_jiffies: Base timestamp for delta calculation
> + * @logger: Parent logger instance
> + *
> + * Each task (or CPU for NAPI) has its own logging context. The context
> + * is embedded in a blog_tls_pagefrag composite which contains the inline
> + * 512KB buffer. Use blog_ctx_pf() to access the pagefrag allocator.
> + * Contexts are recycled through magazine batching system.
> + *
> + * NOTE: struct blog_pagefrag pf has been REMOVED from this structure.
> + * It now lives in the composite (blog_tls_pagefrag). Use blog_ctx_pf(ctx)
> + * to access it. This ensures the pagefrag points to the inline buffer.
> + */
> +struct blog_tls_ctx {
> + struct list_head list;
> + void (*release)(void *data);
> + atomic_t refcount;
Why not refcount_t?
> + struct task_struct *task;
> + pid_t pid;
> + char comm[TASK_COMM_LEN];
> + u64 id;
> + u64 debug_poison;
Should id and debug_poison be under CONFIG_BLOG_DEBUG?
Usually, a poison value is located at the beginning or at the end of the structure.
Probably it makes sense to place debug_poison at the structure's end?
> + unsigned long base_jiffies;
> + struct blog_logger *logger;
> + int pending_offset; /* Offset of reserved but not yet published entry */
> + size_t pending_size; /* Size of reserved entry */
> +};
I think it may make sense to reorder the structure's fields, for example:

struct blog_tls_ctx {
	struct list_head list;
	atomic_t refcount;
	pid_t pid;
	struct task_struct *task;
	struct blog_logger *logger;
	int pending_offset;
	size_t pending_size;
	void (*release)(void *data);
	char comm[TASK_COMM_LEN];
	u64 id;
	u64 debug_poison;
};
> +
> +/**
> + * struct blog_tls_pagefrag - Composite allocation unit for TLS contexts
> + * @ctx: TLS context metadata (refcount, release hook, task info)
> + * @pf: Embedded pagefrag allocator (counters + lock)
> + * @buf: Flexible array for log entries
> + *
> + * This composite structure combines the TLS context header with its
> + * logging buffer into a single allocation unit. Total allocation is
> + * exactly BLOG_PAGEFRAG_SIZE (512KB), with the buffer occupying the
> + * remainder after the header: BLOG_PAGEFRAG_SIZE - sizeof(struct blog_tls_pagefrag).
> + *
> + * The composites flow through the batching system:
> + * alloc_batch (empty) → in-use (filling) → log_batch (full, readable)
> + * → drain → reset → back to alloc_batch.
> + *
> + * The buffer ensures no separate alloc_pages() calls are needed
> + * on the hot path. The pagefrag's buffer pointer is initialized to point
> + * at buf[], and serialization starts at offset 0 within buf[] (which is
> + * already positioned after the metadata).
> + */
> +struct blog_tls_pagefrag {
> + struct blog_tls_ctx ctx;
> + struct blog_pagefrag pf;
> + unsigned char buf[];
> +};
> +
> +/* Total allocation is exactly 512KB - buffer size is the remainder after header */
> +#define BLOG_TLS_PAGEFRAG_ALLOC_SIZE BLOG_PAGEFRAG_SIZE
> +#define BLOG_TLS_PAGEFRAG_BUFFER_SIZE \
> + (BLOG_PAGEFRAG_SIZE - sizeof(struct blog_tls_pagefrag))
> +
> +/**
> + * struct blog_logger - Per-module logger instance
> + * @contexts: List of all TLS contexts for this logger
> + * @lock: Protects contexts list
> + * @alloc_batch: Magazine batch for context allocation
> + * @log_batch: Magazine batch for completed log contexts
> + * @source_map: Array mapping source IDs to source info (max 4096)
> + * @next_source_id: Next source ID to assign
> + * @source_lock: Protects source map operations
> + * @total_contexts_allocated: Total number of contexts created
> + * @next_ctx_id: Next context ID to assign
> + * @ctx_id_lock: Protects context ID counter
> + * @napi_ctxs: Per-CPU NAPI context pointers
> + *
> + * Each module has its own logger instance with isolated source ID space,
> + * context list, and batching system. Composites are allocated via alloc_pages()
> + * since they're 512KB each - too large for slab allocator.
> + */
> +struct blog_logger {
> + struct list_head contexts;
> + spinlock_t lock; /* protects contexts list */
Should this lock be declared before the contexts field?
> + struct blog_batch alloc_batch;
> + struct blog_batch log_batch;
> + struct kmem_cache *magazine_cache; /* Shared cache for magazine structs */
If it is shared, then which lock protects the magazine_cache from race
conditions?
> + struct blog_source_info
> + source_map[BLOG_MAX_SOURCE_IDS];
What's the point of splitting this declaration across two lines?
> + atomic_t next_source_id;
> + spinlock_t source_lock; /* protects source_ids table */
What is the source_ids table, and where is it located?
> + unsigned long total_contexts_allocated;
> + u64 next_ctx_id;
Is next_ctx_id protected by source_lock?
> + spinlock_t ctx_id_lock; /* protects next_ctx_id */
If this lock protects next_ctx_id, then why is it not declared before next_ctx_id?
> + struct blog_tls_ctx *__percpu
> + *napi_ctxs;
What's the point of splitting this declaration across two lines?
> + u8 slot_id;
> + bool has_slot;
> + struct blog_module_context *owner_ctx;
> +};
> +
> +/**
> + * struct blog_log_iter - Iterator for reading log entries from a pagefrag
> + * @pf: Pagefrag being iterated
> + * @current_offset: Current read position in pagefrag
> + * @end_offset: End position (pf->head at iteration start)
> + * @prev_offset: Previous offset (for debugging)
> + * @steps: Number of entries iterated so far
> + *
> + * Used to walk through all log entries in a pagefrag sequentially.
> + * Initialize with blog_log_iter_init(), then call blog_log_iter_next()
> + * repeatedly until it returns NULL.
> + */
> +struct blog_log_iter {
> + struct blog_pagefrag *pf;
> + u64 current_offset;
> + u64 end_offset;
> + u64 prev_offset;
Should prev_offset be under CONFIG_BLOG_DEBUG?
> + u64 steps;
I assume that the pagefrag holds variable-sized entries. So I am not sure how useful
the number of iterated items can be. If I know the start offset, then I can repeat
the same steps; but the number of iterated entries does not provide a way to
recalculate the start offset, because of the variable-sized nature of the items in
the sequence.
> +};
> +
> +/* Client deserialization callback type */
> +typedef int (*blog_client_des_fn)(char *buf, size_t size, u8 client_id);
> +
> +/* Core API functions - all require valid logger parameter */
> +
> +/**
> + * blog_get_source_id - Get or allocate a unique source ID for a log location
> + * @logger: Logger instance to use
> + * @file: Source file name (typically kbasename(__FILE__))
> + * @func: Function name (typically __func__)
> + * @line: Line number (typically __LINE__)
> + * @fmt: Printf-style format string for this log site
> + *
> + * Assigns a unique source ID to a specific file:func:line location. The ID
> + * is typically cached in a static variable at the call site for fast lookup.
> + * The format string is stored in the logger's source map for later
> + * deserialization.
> + *
> + * Context: Can be called from any context (process, softirq, hardirq)
> + * Return: Source ID (1 to BLOG_MAX_SOURCE_IDS-1), or 0 on error
> + */
> +u32 blog_get_source_id(struct blog_logger *logger, const char *file,
> + const char *func, unsigned int line, const char *fmt);
> +
> +/**
> + * blog_get_source_info - Retrieve source information for a given source ID
> + * @logger: Logger instance to query
> + * @id: Source ID to look up
> + *
> + * Retrieves the file, function, line, and format string associated with
> + * a source ID. Used during deserialization to reconstruct log messages.
> + *
> + * Context: Any context
> + * Return: Pointer to source_info structure, or NULL if ID is invalid
> + */
> +struct blog_source_info *blog_get_source_info(struct blog_logger *logger,
> + u32 id);
> +
> +/**
> + * blog_log - Reserve buffer for a binary log message
> + * @logger: Logger instance to use
> + * @source_id: Source ID for this log location (from blog_get_source_id)
> + * @client_id: Module-specific client identifier (0 if not used)
> + * @needed_size: Size in bytes needed for serialized arguments
> + *
> + * Reserves space in the current context's pagefrag for a log entry and
> + * returns a buffer pointer for the caller to serialize arguments into.
> + * The log entry header (timestamp, source_id, etc.) is filled automatically.
> + *
> + * The reserved space is NOT visible to readers until blog_log_commit() is
> + * called after serialization completes. This prevents readers from seeing
> + * partially-written entries.
> + *
> + * If allocation fails, the pagefrag is reset and retried up to 3 times.
> + * Payload must not exceed BLOG_MAX_PAYLOAD (255 bytes).
> + *
> + * Context: Process or softirq (automatically selects appropriate context)
> + * Return: Buffer pointer to write serialized data, or NULL on failure
> + */
> +void *blog_log(struct blog_logger *logger, u32 source_id, u8 client_id,
> + size_t needed_size);
> +
> +/**
> + * blog_log_commit - Publish a reserved log entry
> + * @logger: Logger instance
> + * @actual_size: Actual bytes written during serialization
> + *
> + * Publishes the log entry that was reserved by the last blog_log() call,
> + * making it visible to readers. Must be called after serialization is complete.
> + *
> + * The actual_size should be the number of bytes actually written, which may be
> + * less than the worst-case size reserved (e.g., strings may be shorter than 255).
> + *
> + * Uses memory barrier to ensure all writes are visible before making entry
> + * visible to readers.
> + *
> + * Context: Same context as the preceding blog_log() call
> + * Return: 0 on success, negative error code on failure
> + */
> +int blog_log_commit(struct blog_logger *logger, size_t actual_size);
> +
> +/**
> + * blog_get_tls_ctx - Get or create per-task logging context
> + * @logger: Logger instance to use
> + *
> + * Returns the BLOG context for the current task, creating it if needed.
> + * Each task has a 512KB pagefrag buffer for logging. This function should
> + * not be called directly - use blog_get_ctx() instead.
If this function should not be called directly, then what is the point of declaring
it here?
> + *
> + * Context: Process context only (uses current task)
> + * Return: TLS context pointer, or NULL on allocation failure
> + */
> +struct blog_tls_ctx *blog_get_tls_ctx(struct blog_logger *logger);
> +
> +/**
> + * blog_get_napi_ctx - Get NAPI logging context for current CPU
> + * @logger: Logger instance to use
> + *
> + * Returns the NAPI (softirq) context for the current CPU. NAPI contexts
> + * must be explicitly set via blog_set_napi_ctx() before use.
> + *
> + * Context: Softirq context
> + * Return: NAPI context pointer, or NULL if not set
> + */
> +struct blog_tls_ctx *blog_get_napi_ctx(struct blog_logger *logger);
> +
> +/**
> + * blog_set_napi_ctx - Set NAPI logging context for current CPU
> + * @logger: Logger instance
> + * @ctx: Context to associate with this CPU's NAPI processing
> + *
> + * Associates a logging context with the current CPU for use during
> + * softirq (NAPI) processing. This allows network drivers and other
> + * softirq handlers to log without accessing per-task contexts.
> + *
> + * Context: Any context (typically called during initialization)
> + * Return: void
> + */
> +void blog_set_napi_ctx(struct blog_logger *logger, struct blog_tls_ctx *ctx);
> +
> +/**
> + * blog_get_ctx - Get appropriate logging context based on execution context
> + * @logger: Logger instance to use
> + *
> + * Automatically selects the correct context:
> + * - Softirq context: Returns NAPI context (or falls back to TLS)
> + * - Process context: Returns per-task TLS context
> + *
> + * This is the recommended function for getting contexts.
> + *
> + * Context: Any context
> + * Return: Logging context pointer, or NULL on failure
> + */
> +struct blog_tls_ctx *blog_get_ctx(struct blog_logger *logger);
> +
> +/**
> + * blog_log_iter_init - Initialize iterator for reading log entries
> + * @iter: Iterator structure to initialize
> + * @pf: Pagefrag containing log entries to iterate over
> + * @head_snapshot: Snapshot of pf->head at lock acquisition time
> + *
> + * Prepares an iterator to walk through all log entries in a pagefrag.
> + * The head_snapshot parameter defines the boundary - only entries up to
> + * this offset will be iterated. Caller must hold pf->lock throughout
> + * iteration to prevent buffer reset/recycling.
> + *
> + * Context: Any context
> + * Return: void
> + */
> +void blog_log_iter_init(struct blog_log_iter *iter, struct blog_pagefrag *pf,
> + u64 head_snapshot);
> +
> +/**
> + * blog_log_iter_next - Get next log entry from iterator
> + * @iter: Iterator previously initialized with blog_log_iter_init()
> + *
> + * Advances the iterator to the next log entry in the pagefrag.
> + * Entries are returned in chronological order (order they were logged).
> + *
> + * IMPORTANT: Caller must hold pf->lock for the entire duration of iteration
> + * (from blog_log_iter_init() through all blog_log_iter_next() calls until
> + * iteration completes). This prevents blog_pagefrag_reset() from clearing
> + * the buffer mid-iteration. Writers remain lockless and never take pf->lock.
> + *
> + * Context: Any context, with pf->lock held
> + * Return: Pointer to next log entry, or NULL when no more entries
> + */
> +struct blog_log_entry *blog_log_iter_next(struct blog_log_iter *iter);
> +
> +/**
> + * blog_des_entry - Deserialize a log entry into human-readable format
> + * @logger: Logger instance (for source map lookup)
> + * @entry: Log entry to deserialize
> + * @output: Buffer to write formatted string to
> + * @out_size: Size of output buffer in bytes
> + * @client_cb: Optional callback to handle module-specific client_id formatting
> + *
> + * Reconstructs a formatted log message from binary log entry. Uses the
> + * source_id to look up the format string, then deserializes the entry's
> + * buffer according to the format specifiers.
> + *
> + * If client_cb is provided, it's called to format the client_id prefix.
> + * Otherwise, client_id is ignored.
> + *
> + * Context: Any context
> + * Return: Number of bytes written to output buffer, or negative error code
> + */
> +int blog_des_entry(struct blog_logger *logger, struct blog_log_entry *entry,
> + char *output, size_t out_size,
> + blog_client_des_fn client_cb);
> +
> +/**
> + * blog_logger_print_stats - Print logger statistics to kernel log
> + * @logger: Logger instance to print stats for
> + *
> + * Debug helper that prints current state of logger's batching system
> + * and context counts. Output goes to kernel log at debug level.
> + *
> + * Context: Any context
> + * Return: void
> + */
> +static inline void blog_logger_print_stats(struct blog_logger *logger)
> +{
> + pr_debug(
> + "blog: total_contexts=%lu, alloc_batch={empty=%d, full=%d}, log_batch={empty=%d, full=%d}\n",
> + logger->total_contexts_allocated, logger->alloc_batch.nr_empty,
> + logger->alloc_batch.nr_full, logger->log_batch.nr_empty,
> + logger->log_batch.nr_full);
Do we have multiple logger instances? If so, then how can we distinguish the output
of the different instances in the kernel log? One option is sketched after the
function below.
> +}
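A sketch of what I mean, reusing the owner_ctx->name field that this patch already
introduces (the NULL check is my assumption, in case stats are printed before the
owner is set):

	pr_debug("blog[%s]: total_contexts=%lu, alloc_batch={empty=%d, full=%d}, log_batch={empty=%d, full=%d}\n",
		 logger->owner_ctx ? logger->owner_ctx->name : "(none)",
		 logger->total_contexts_allocated, logger->alloc_batch.nr_empty,
		 logger->alloc_batch.nr_full, logger->log_batch.nr_empty,
		 logger->log_batch.nr_full);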
> +
> +/**
> + * blog_is_valid_kernel_addr - Check if address is in valid kernel range
> + * @addr: Address to validate
> + *
> + * Verifies that an address points to valid kernel memory using
> + * virt_addr_valid(). Used internally for sanity checking.
> + *
> + * Context: Any context
> + * Return: true if address is valid, false otherwise
> + */
> +bool blog_is_valid_kernel_addr(const void *addr);
> +
> +/**
> + * blog_ctx_container - Get composite container from TLS context pointer
> + * @ctx: TLS context pointer
> + *
> + * Returns the containing blog_tls_pagefrag composite that holds this context.
> + * Used internally to access the inline buffer and pagefrag fields.
> + *
> + * Context: Any context
> + * Return: Pointer to containing blog_tls_pagefrag composite
> + */
> +static inline struct blog_tls_pagefrag *blog_ctx_container(struct blog_tls_ctx *ctx)
> +{
> + return container_of(ctx, struct blog_tls_pagefrag, ctx);
> +}
> +
> +/**
> + * blog_ctx_buffer - Get inline buffer pointer from TLS context
> + * @ctx: TLS context pointer
> + *
> + * Returns pointer to the 512KB inline buffer within the composite.
> + * This buffer is where log entries are stored.
> + *
> + * Context: Any context
> + * Return: Pointer to inline buffer
> + */
> +static inline void *blog_ctx_buffer(struct blog_tls_ctx *ctx)
Does buf have unsigned char data type?
> +{
> + return blog_ctx_container(ctx)->buf;
> +}
> +
> +/**
> + * blog_ctx_pf - Get pagefrag structure from TLS context
> + * @ctx: TLS context pointer
> + *
> + * Returns pointer to the embedded pagefrag allocator within the composite.
> + * The pagefrag's buffer pointer is initialized to point at the inline buffer.
> + *
> + * Context: Any context
> + * Return: Pointer to embedded pagefrag structure
> + */
> +static inline struct blog_pagefrag *blog_ctx_pf(struct blog_tls_ctx *ctx)
> +{
> + return &blog_ctx_container(ctx)->pf;
> +}
> +
> +/*
> + * No global logging macros - all logging must use per-module contexts
> + * Use BLOG_LOG_CTX() and BLOG_LOG_CLIENT_CTX() from blog_module.h instead
> + */
> +
> +/*
> + * These low-level logger macros are deprecated.
> + * Use BLOG_LOG_CTX() and BLOG_LOG_CLIENT_CTX() from blog_module.h instead.
> + */
These comments look strange here.
> +
> +#endif /* _LINUX_BLOG_H */
> diff --git a/include/linux/blog/blog_batch.h b/include/linux/blog/blog_batch.h
> new file mode 100644
> index 000000000000..9752272a66fe
> --- /dev/null
> +++ b/include/linux/blog/blog_batch.h
> @@ -0,0 +1,54 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Binary Logging Batch Management
> + */
> +#ifndef _LINUX_BLOG_BATCH_H
> +#define _LINUX_BLOG_BATCH_H
> +
> +#include <linux/types.h>
> +#include <linux/percpu.h>
> +#include <linux/spinlock.h>
> +#include <linux/list.h>
> +
> +/* Size of each magazine (number of elements it can hold) */
> +#define BLOG_MAGAZINE_SIZE 16
> +
> +/* Structure representing a single magazine */
> +struct blog_magazine {
> + struct list_head list; /* For linking in global pools */
> + unsigned int count; /* Number of elements currently in magazine */
> + void *elements[BLOG_MAGAZINE_SIZE];
> +};
> +
> +/* Per-CPU magazine state */
> +struct blog_cpu_magazine {
> + struct blog_magazine *mag; /* Current magazine for this CPU */
> +};
> +
> +/* Global magazine pools */
> +struct blog_batch {
> + struct list_head full_magazines; /* List of full magazines */
> + struct list_head empty_magazines; /* List of empty magazines */
> + spinlock_t full_lock; /* Protects full magazine list and count */
> + spinlock_t empty_lock; /* Protects empty magazine list and count */
> + unsigned int nr_full; /* Protected by full_lock */
> + unsigned int nr_empty; /* Protected by empty_lock */
> + struct blog_cpu_magazine __percpu *cpu_magazines; /* Per-CPU magazines */
> + struct kmem_cache *magazine_cache; /* Cache for magazine structures themselves */
Is it protected by some lock?
> + bool external_cache; /* True when magazine_cache owned by caller */
> +};
It looks like there is a need to introduce an auxiliary structure here:

struct magazines_list {
	spinlock_t lock;
	struct list_head magazines;
	unsigned int nr;
};

struct blog_batch {
	struct magazines_list full;
	struct magazines_list empty;
	struct blog_cpu_magazine __percpu *cpu_magazines;
	struct kmem_cache *magazine_cache;
	bool external_cache;
};
> +
> +/* Initialize the batching system */
> +int blog_batch_init(struct blog_batch *batch, struct kmem_cache *mag_cache,
> + bool prepopulate);
> +
> +/* Clean up the batching system */
> +void blog_batch_cleanup(struct blog_batch *batch);
> +
> +/* Get an element from the batch */
> +void *blog_batch_get(struct blog_batch *batch);
> +
> +/* Put an element back into the batch */
> +void blog_batch_put(struct blog_batch *batch, void *element);
> +
> +#endif /* _LINUX_BLOG_BATCH_H */
> diff --git a/include/linux/blog/blog_des.h b/include/linux/blog/blog_des.h
> new file mode 100644
> index 000000000000..a9076e89917c
> --- /dev/null
> +++ b/include/linux/blog/blog_des.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Binary Logging Deserialization
> + */
> +#ifndef _LINUX_BLOG_DES_H
> +#define _LINUX_BLOG_DES_H
> +
> +#include <linux/types.h> /* For size_t */
> +
> +/* Forward declarations */
> +struct blog_log_entry;
> +struct blog_logger;
> +
> +/**
> + * blog_des_reconstruct - Reconstructs a formatted string from serialized values
> + * @fmt: Format string containing % specifiers
> + * @buffer: Buffer containing serialized values
> + * @nr_args: Number of arguments to process
> + * @size: Size of the buffer in bytes
> + * @out: Buffer to store the reconstructed string
> + * @out_size: Size of the output buffer
> + *
> + * The function uses the format string to determine the types and number of values
> + * to extract from the buffer.
> + *
> + * Return: Number of bytes written to out buffer, or negative error code on failure
> + */
> +int blog_des_reconstruct(const char *fmt, const void *buffer, size_t nr_args,
> + size_t size, char *out, size_t out_size);
> +
> +/**
> + * blog_log_reconstruct - Reconstructs a formatted string from a log entry
> + * @entry: Log entry containing serialized data
> + * @output: Buffer to write the formatted string to
> + * @output_size: Size of the output buffer
> + *
> + * This is a wrapper around blog_des_reconstruct that handles log entry parsing.
> + * Note: This does NOT handle client_id - the caller should handle that separately
> + * using their module-specific callback.
> + *
> + * Return: Length of formatted string, or negative error code on failure
> + */
> +int blog_log_reconstruct(struct blog_logger *logger, const struct blog_log_entry *entry,
> + char *output, size_t output_size);
> +
> +#endif /* _LINUX_BLOG_DES_H */
> diff --git a/include/linux/blog/blog_module.h b/include/linux/blog/blog_module.h
> new file mode 100644
> index 000000000000..06397204027b
> --- /dev/null
> +++ b/include/linux/blog/blog_module.h
> @@ -0,0 +1,329 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Binary Logging Infrastructure (BLOG) - Per-Module Support
> + *
> + * This header defines the per-module context support for BLOG.
> + * Each kernel module can have its own isolated logging context.
> + */
> +#ifndef _LINUX_BLOG_MODULE_H
> +#define _LINUX_BLOG_MODULE_H
> +
> +#include <linux/blog/blog.h>
> +
> +/**
> + * struct blog_module_context - Per-module BLOG state
> + * @name: Module name (max 31 chars + null terminator)
> + * @slot_id: Assigned slot ID (0-7) in task->blog_contexts array
> + * @logger: Logger instance for this module (isolated from other modules)
> + * @module_private: Opaque pointer for module-specific data
> + * @list: Linkage in global list of all module contexts
> + * @refcount: Reference count for module context lifecycle
> + * @allocated_contexts: Number of contexts currently allocated (includes freed but not yet recycled)
> + * @initialized: True after blog_module_init() succeeds
> + *
> + * Represents a registered BLOG module with its own isolated logger instance,
> + * slot ID for O(1) per-task context access, and independent source ID namespace.
> + * Created via blog_module_init() and destroyed via blog_module_cleanup().
> + */
> +struct blog_module_context {
> + char name[32];
> + u8 slot_id;
> + struct blog_logger *logger;
> + void *module_private;
> + struct list_head list;
> + atomic_t refcount;
Why not refcount_t? A short sketch of what I have in mind follows the structure
below.
> + atomic_t allocated_contexts;
> + bool initialized;
> +};
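Regarding refcount_t: a minimal sketch of the get/put side, assuming the module
context starts life with one reference in blog_module_init() and that a hypothetical
blog_module_free() does the actual freeing:

	refcount_t refcount;

	/* blog_module_init(): */
	refcount_set(&ctx->refcount, 1);

	/* blog_module_get(): */
	refcount_inc(&ctx->refcount);

	/* blog_module_put(): */
	if (refcount_dec_and_test(&ctx->refcount))
		blog_module_free(ctx);	/* hypothetical helper */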
> +
> +/**
> + * struct blog_module_registry - Global registry of all BLOG modules
> + * @modules: Array of registered module contexts (max 8)
> + * @allocated_bitmap: Bitmap of allocated slot IDs (8 bits)
> + * @lock: Protects registration/unregistration operations
> + * @module_count: Number of currently registered modules
> + *
> + * Global singleton that tracks all registered BLOG modules and assigns
> + * slot IDs. Protected by spinlock for thread-safe registration.
> + */
> +struct blog_module_registry {
> + struct blog_module_context *modules[BLOG_MAX_MODULES];
> + u8 allocated_bitmap;
> + spinlock_t lock; /* protects module state */
Does it protect modules? If so, why is it not located before modules?
> + atomic_t module_count;
> +};
> +
> +/* Module registration API */
> +
> +/**
> + * blog_module_register - Register a module and allocate a slot
> + * @module_name: Name of the module (max 31 chars)
> + *
> + * Registers a module in the global BLOG registry and assigns it a unique
> + * slot ID (0-7). The slot ID is used to index into each task's
> + * blog_contexts array for O(1) per-task context access.
> + *
> + * Only 8 modules can be registered simultaneously (BLOG_MAX_MODULES).
> + *
> + * Context: Process context (uses GFP_KERNEL allocation)
> + * Return: Module context on success, NULL if no slots available or invalid name
> + */
> +struct blog_module_context *blog_module_register(const char *module_name);
> +
> +/**
> + * blog_module_unregister - Unregister a module and free its slot
> + * @ctx: Module context to unregister
> + *
> + * Removes the module from the global registry and frees its slot for reuse.
> + * Must be called after blog_module_cleanup() to ensure all contexts are freed.
> + *
> + * Context: Process context
> + * Return: void
> + */
> +void blog_module_unregister(struct blog_module_context *ctx);
> +
> +/* Module context management API */
> +
> +/**
> + * blog_module_init - Initialize a per-module BLOG context
> + * @module_name: Name of the module (max 31 chars)
> + *
> + * Creates a complete isolated logging context for a kernel module, including:
> + * - Module registration and slot allocation
> + * - Logger instance with batching system
> + * - Source ID mapping (4096 sources)
> + * - Per-CPU NAPI context support
> + *
> + * This is the main entry point for modules that want to use BLOG.
> + *
> + * Context: Process context (uses GFP_KERNEL allocations)
> + * Return: Module context on success, NULL on failure
> + */
> +struct blog_module_context *blog_module_init(const char *module_name);
> +
> +/**
> + * blog_module_cleanup - Clean up a module's BLOG context
> + * @ctx: Module context to clean up
> + *
> + * Iterates through all tasks that have contexts for this module and
> + * detaches/frees them. Also cleans up batching system and per-CPU
> + * NAPI contexts. Should be called during module unload.
> + *
> + * Warning: This acquires task_lock for every task with a context, which
> + * can be slow if many tasks are using the module.
> + *
> + * Context: Process context
> + * Return: void
> + */
> +void blog_module_cleanup(struct blog_module_context *ctx);
> +
> +/**
> + * blog_module_get - Increment module context reference count
> + * @ctx: Module context
> + *
> + * Takes a reference on the module context to prevent it from being freed.
> + * Must be paired with blog_module_put().
> + *
> + * Context: Any context
> + * Return: void
> + */
> +void blog_module_get(struct blog_module_context *ctx);
> +
> +/**
> + * blog_module_put - Decrement module context reference count
> + * @ctx: Module context
> + *
> + * Releases a reference on the module context. When the last reference
> + * is dropped, the context is automatically cleaned up.
> + *
> + * Context: Any context
> + * Return: void
> + */
> +void blog_module_put(struct blog_module_context *ctx);
> +
> +/* Per-module API functions */
> +
> +/**
> + * blog_get_source_id_ctx - Get source ID for a module's log location
> + * @ctx: Module context
> + * @file: Source file name (typically kbasename(__FILE__))
> + * @func: Function name (typically __func__)
> + * @line: Line number (typically __LINE__)
> + * @fmt: Printf-style format string
> + *
> + * Per-module wrapper around blog_get_source_id(). Source IDs are
> + * module-local (different modules can have same source_id values).
> + *
> + * Context: Any context
> + * Return: Source ID for this module's logger, or 0 on error
> + */
> +u32 blog_get_source_id_ctx(struct blog_module_context *ctx, const char *file,
> + const char *func, unsigned int line, const char *fmt);
> +
> +/**
> + * blog_get_source_info_ctx - Get source info for a module-local source ID
> + * @ctx: Module context
> + * @id: Source ID to look up
> + *
> + * Per-module wrapper around blog_get_source_info().
> + *
> + * Context: Any context
> + * Return: Source info pointer, or NULL if invalid
> + */
> +struct blog_source_info *blog_get_source_info_ctx(struct blog_module_context *ctx, u32 id);
> +
> +/**
> + * blog_log_ctx - Log a message using module context
> + * @ctx: Module context
> + * @source_id: Source ID (from blog_get_source_id_ctx)
> + * @client_id: Module-specific client identifier
> + * @needed_size: Size in bytes for serialized arguments
> + *
> + * Per-module wrapper around blog_log(). Uses the module's slot ID to
> + * access the per-task context from task->blog_contexts[slot_id].
> + *
> + * Context: Process or softirq
> + * Return: Buffer pointer for serialization, or NULL on failure
> + */
> +void *blog_log_ctx(struct blog_module_context *ctx, u32 source_id,
> + u8 client_id, size_t needed_size);
> +
> +/**
> + * blog_get_tls_ctx_ctx - Get or create per-task context for this module
> + * @ctx: Module context
> + *
> + * Gets the logging context for current task and this specific module.
> + * Uses slot-based access: task->blog_contexts[ctx->slot_id].
> + * Creates the context on first use (lazy allocation).
> + *
> + * Context: Process context only
> + * Return: TLS context pointer, or NULL on allocation failure
> + */
> +struct blog_tls_ctx *blog_get_tls_ctx_ctx(struct blog_module_context *ctx);
> +
> +/**
> + * blog_get_napi_ctx_ctx - Get NAPI context for this module
> + * @ctx: Module context
> + *
> + * Returns the NAPI (softirq) context for current CPU and this module.
> + *
> + * Context: Softirq context
> + * Return: NAPI context pointer, or NULL if not set
> + */
> +struct blog_tls_ctx *blog_get_napi_ctx_ctx(struct blog_module_context *ctx);
> +
> +/**
> + * blog_set_napi_ctx_ctx - Set NAPI context for this module
> + * @ctx: Module context
> + * @tls_ctx: Context to use for NAPI on current CPU
> + *
> + * Associates a context with current CPU for softirq logging.
> + *
> + * Context: Any context
> + * Return: void
> + */
> +void blog_set_napi_ctx_ctx(struct blog_module_context *ctx, struct blog_tls_ctx *tls_ctx);
> +
> +/**
> + * blog_get_ctx_ctx - Get appropriate context for this module
> + * @ctx: Module context
> + *
> + * Automatically selects NAPI or TLS context based on execution context.
> + * This is the recommended function for per-module context access.
> + *
> + * Context: Any context
> + * Return: Context pointer, or NULL on failure
> + */
> +struct blog_tls_ctx *blog_get_ctx_ctx(struct blog_module_context *ctx);
> +
> +/**
> + * blog_log_commit_ctx - Publish a reserved log entry
> + * @ctx: Module context
> + * @actual_size: Actual bytes written during serialization
> + *
> + * Publishes the log entry that was reserved by the last blog_log_ctx() call,
> + * making it visible to readers. Must be called after serialization is complete.
> + *
> + * The actual_size should be the number of bytes actually written, which may be
> + * less than the worst-case size reserved (e.g., strings may be shorter than 255).
> + *
> + * Context: Same context as the preceding blog_log_ctx() call
> + * Return: 0 on success, negative error code on failure
> + */
> +int blog_log_commit_ctx(struct blog_module_context *ctx, size_t actual_size);
> +
> +/*
> + * Per-module logging macros
> + *
> + * These macros provide the primary logging interface for modules using BLOG.
> + * They handle source ID caching, size calculation, serialization, and trimming
> + * automatically.
> + */
> +
> +/**
> + * BLOG_LOG_CTX - Log a message using module context (no client ID)
> + * @ctx: Module context from blog_module_init()
> + * @fmt: Printf-style format string
> + * @...: Arguments matching format string
> + *
> + * Primary logging macro for per-module BLOG usage. Automatically handles:
> + * - Source ID allocation and caching (static variable per call site)
> + * - Size calculation at compile time
> + * - Context acquisition (task or NAPI)
> + * - Serialization of arguments
> + * - Trimming of unused space
> + *
> + * Example:
> + * BLOG_LOG_CTX(my_module_ctx, "Processing inode %llu size %zu\n",
> + * inode_num, size);
> + *
> + * Context: Any context (automatically selects task or NAPI context)
> + */
> +#define BLOG_LOG_CTX(ctx, fmt, ...) \
> + __BLOG_LOG_CTX(ctx, 0, 0, fmt, ##__VA_ARGS__)
> +
> +/**
> + * BLOG_LOG_CLIENT_CTX - Log a message with client identifier
> + * @ctx: Module context from blog_module_init()
> + * @client_id: Module-specific client identifier (e.g., connection ID)
> + * @fmt: Printf-style format string
> + * @...: Arguments matching format string
> + *
> + * Like BLOG_LOG_CTX but includes a client_id in the log entry. The client_id
> + * is module-specific and can be used to associate logs with specific clients,
> + * connections, or sessions.
> + *
> + * Example:
> + * BLOG_LOG_CLIENT_CTX(ceph_ctx, ceph_client_id,
> + * "Cap update for inode %llu\n", inode);
> + *
> + * During deserialization, the module's client callback is invoked to
> + * format the client_id (e.g., "[fsid global_id]" prefix).
> + *
> + * Context: Any context (automatically selects task or NAPI context)
> + */
> +#define BLOG_LOG_CLIENT_CTX(ctx, client_id, fmt, ...) \
> + __BLOG_LOG_CTX(ctx, 0, client_id, fmt, ##__VA_ARGS__)
> +
> +/* Internal implementation - do not use directly */
> +#define __BLOG_LOG_CTX(__ctx, dbg, __client_id, fmt, ...) \
Why is this implementation not declared before BLOG_LOG_CTX and
BLOG_LOG_CLIENT_CTX?
> + do { \
> + static u32 __source_id; \
> + static size_t __size; \
> + void *___buffer = NULL; \
> + (void)(dbg); /* Reserved for future debug functionality */ \
> + if (unlikely(__source_id == 0)) { \
> + __source_id = blog_get_source_id_ctx(__ctx, \
> + kbasename(__FILE__), __func__, __LINE__, fmt); \
> + __size = blog_cnt(__VA_ARGS__); \
> + } \
> + ___buffer = blog_log_ctx(__ctx, __source_id, __client_id, __size); \
> + if (likely(___buffer)) { \
> + void *___tmp = ___buffer; \
> + if (__size > 0) \
> + blog_ser(___buffer, ##__VA_ARGS__);\
> + blog_log_commit_ctx(__ctx, ___buffer - ___tmp); \
> + } \
> + } while (0)
> +
> +#endif /* _LINUX_BLOG_MODULE_H */
> diff --git a/include/linux/blog/blog_pagefrag.h b/include/linux/blog/blog_pagefrag.h
> new file mode 100644
> index 000000000000..f48c0264f13b
> --- /dev/null
> +++ b/include/linux/blog/blog_pagefrag.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Binary Logging Page Fragment Management
> + */
> +#ifndef _LINUX_BLOG_PAGEFRAG_H
> +#define _LINUX_BLOG_PAGEFRAG_H
> +
> +#include <linux/types.h>
> +#include <linux/mm.h>
> +#include <linux/spinlock.h>
> +
> +#define BLOG_PAGEFRAG_SIZE (1<<19) /* 512KB */
This is a constant. Why not declare it in a simpler way, for example (512 * 1024)?
> +#define BLOG_PAGEFRAG_MASK (BLOG_PAGEFRAG_SIZE - 1)
> +
> +/* Pagefrag allocator structure */
Somehow there are no detailed comments here. :)
> +struct blog_pagefrag {
> + struct page *pages;
> + void *buffer;
Do pages and buffer require any lock?
> + size_t capacity; /* Buffer capacity for bounds checking */
> + spinlock_t lock; /* protects head */
> + unsigned int head;
> + unsigned int alloc_count;
> + int active_elements;
> + void *last_entry; /* Pointer to the last allocated entry */
> +};
> +
> +int blog_pagefrag_init_with_buffer(struct blog_pagefrag *pf, void *buffer, size_t size);
> +int blog_pagefrag_reserve(struct blog_pagefrag *pf, unsigned int n);
> +void blog_pagefrag_publish(struct blog_pagefrag *pf, u64 publish_head);
> +void blog_pagefrag_reset(struct blog_pagefrag *pf);
> +void *blog_pagefrag_get_ptr(struct blog_pagefrag *pf, u64 val);
> +
> +#endif /* _LINUX_BLOG_PAGEFRAG_H */
> diff --git a/include/linux/blog/blog_ser.h b/include/linux/blog/blog_ser.h
> new file mode 100644
> index 000000000000..a492a1a08b53
> --- /dev/null
> +++ b/include/linux/blog/blog_ser.h
> @@ -0,0 +1,275 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Binary Logging Serialization
> + */
> +#ifndef _LINUX_BLOG_SER_H
> +#define _LINUX_BLOG_SER_H
> +
> +#include <linux/string.h>
> +#include <linux/kernel.h>
> +
> +#define IS_CONST_STR_PTR(t) \
> + __builtin_types_compatible_p(typeof(t), const char *)
> +
> +#define IS_STR_PTR(t) \
> + __builtin_types_compatible_p(typeof(t), char *)
> +
> +#define IS_STR(t) \
> + (__builtin_types_compatible_p(typeof(t), const char *) || \
> + __builtin_types_compatible_p(typeof(t), char *))
> +
> +#define __suppress_cast_warning(type, value) \
> +({ \
> + _Pragma("GCC diagnostic push") \
> + _Pragma("GCC diagnostic ignored \"-Wint-to-pointer-cast\"") \
> + _Pragma("GCC diagnostic ignored \"-Wpointer-to-int-cast\"") \
> + type __scw_result; \
> + __scw_result = ((type)(value)); \
> + _Pragma("GCC diagnostic pop") \
> + __scw_result; \
> +})
> +
> +#define ___blog_concat(__a, __b) __a ## __b
> +#define ___blog_apply(__fn, __n) ___blog_concat(__fn, __n)
> +
> +#define ___blog_nth(_, __1, __2, __3, __4, __5, __6, __7, __8, __9, \
> + __10, __11, __12, __13, __14, __15, __16, __17, __18, __19, __20, \
> + __21, __22, __23, __24, __25, __26, __27, __28, __29, __30, __31, \
> + __32, __N, ...) __N
> +#define ___blog_narg(...) ___blog_nth(_, ##__VA_ARGS__, \
> + 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, \
> + 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
> +#define blog_narg(...) ___blog_narg(__VA_ARGS__)
> +
> +#define STR_MAX_SIZE 255
> +#define __sizeof(x) \
> + (IS_STR(x) ? STR_MAX_SIZE : \
> + (sizeof(x) < 4) ? 4 : sizeof(x))
> +
> +/* Size calculation macros */
> +#define ___blog_cnt0() (0)
> +#define ___blog_cnt1(__t) (__sizeof(__t))
> +#define ___blog_cnt2(__t, __args...) (___blog_cnt1(__args) + __sizeof(__t))
> +#define ___blog_cnt3(__t, __args...) (___blog_cnt2(__args) + __sizeof(__t))
> +#define ___blog_cnt4(__t, __args...) (___blog_cnt3(__args) + __sizeof(__t))
> +#define ___blog_cnt5(__t, __args...) (___blog_cnt4(__args) + __sizeof(__t))
> +#define ___blog_cnt6(__t, __args...) (___blog_cnt5(__args) + __sizeof(__t))
> +#define ___blog_cnt7(__t, __args...) (___blog_cnt6(__args) + __sizeof(__t))
> +#define ___blog_cnt8(__t, __args...) (___blog_cnt7(__args) + __sizeof(__t))
> +#define ___blog_cnt9(__t, __args...) (___blog_cnt8(__args) + __sizeof(__t))
> +#define ___blog_cnt10(__t, __args...) (___blog_cnt9(__args) + __sizeof(__t))
> +#define ___blog_cnt11(__t, __args...) (___blog_cnt10(__args) + __sizeof(__t))
> +#define ___blog_cnt12(__t, __args...) (___blog_cnt11(__args) + __sizeof(__t))
> +#define ___blog_cnt13(__t, __args...) (___blog_cnt12(__args) + __sizeof(__t))
> +#define ___blog_cnt14(__t, __args...) (___blog_cnt13(__args) + __sizeof(__t))
> +#define ___blog_cnt15(__t, __args...) (___blog_cnt14(__args) + __sizeof(__t))
> +#define ___blog_cnt16(__t, __args...) (___blog_cnt15(__args) + __sizeof(__t))
> +#define ___blog_cnt17(__t, __args...) (___blog_cnt16(__args) + __sizeof(__t))
> +#define ___blog_cnt18(__t, __args...) (___blog_cnt17(__args) + __sizeof(__t))
> +#define ___blog_cnt19(__t, __args...) (___blog_cnt18(__args) + __sizeof(__t))
> +#define ___blog_cnt20(__t, __args...) (___blog_cnt19(__args) + __sizeof(__t))
> +#define ___blog_cnt21(__t, __args...) (___blog_cnt20(__args) + __sizeof(__t))
> +#define ___blog_cnt22(__t, __args...) (___blog_cnt21(__args) + __sizeof(__t))
> +#define ___blog_cnt23(__t, __args...) (___blog_cnt22(__args) + __sizeof(__t))
> +#define ___blog_cnt24(__t, __args...) (___blog_cnt23(__args) + __sizeof(__t))
> +#define ___blog_cnt25(__t, __args...) (___blog_cnt24(__args) + __sizeof(__t))
> +#define ___blog_cnt26(__t, __args...) (___blog_cnt25(__args) + __sizeof(__t))
> +#define ___blog_cnt27(__t, __args...) (___blog_cnt26(__args) + __sizeof(__t))
> +#define ___blog_cnt28(__t, __args...) (___blog_cnt27(__args) + __sizeof(__t))
> +#define ___blog_cnt29(__t, __args...) (___blog_cnt28(__args) + __sizeof(__t))
> +#define ___blog_cnt30(__t, __args...) (___blog_cnt29(__args) + __sizeof(__t))
> +#define ___blog_cnt31(__t, __args...) (___blog_cnt30(__args) + __sizeof(__t))
> +#define ___blog_cnt32(__t, __args...) (___blog_cnt31(__args) + __sizeof(__t))
> +#define blog_cnt(...) ___blog_apply(___blog_cnt, blog_narg(__VA_ARGS__))(__VA_ARGS__)
> +
> +#define IS_STR_ARRAY(t) \
> + __builtin_types_compatible_p(typeof(t), char [])
> +
> +#define IS_DYNAMIC_CHAR_PTR(t) \
> + (__builtin_classify_type((t)) == 14 && \
> + __builtin_types_compatible_p(typeof(t), char *) && \
> + !__builtin_constant_p((t)))
> +
> +#define IS_STATIC_CHAR_ARRAY(t) \
> + (__builtin_classify_type((t)) == 5 && \
> + __builtin_types_compatible_p(typeof(t), char[]) && \
> + __builtin_constant_p((t)))
> +
> +#define IS_DYNAMIC_CHAR_ARRAY(t) \
> + (__builtin_classify_type((t)) == 5 && \
> + __builtin_types_compatible_p(typeof(t), char[]) && \
> + !__builtin_constant_p((t)))
> +
> +#define char_ptr(str) __suppress_cast_warning(char *, (str))
> +
> +#ifndef _CEPH_BLOG_SER_HELPERS_DEFINED
> +#define _CEPH_BLOG_SER_HELPERS_DEFINED
> +
> +union null_str_u {
> + char str[8];
> + unsigned long force_align;
> +};
> +
> +static const union null_str_u null_str = {
> + .str = "(NULL) \0"
> +};
> +
> +static inline size_t write_null_str(char *dst)
> +{
> + *(union null_str_u *)dst = null_str;
> + static_assert(sizeof(null_str.str) == sizeof(unsigned long),
> + "null_str.str size must match unsigned long for proper alignment");
> + return __builtin_strlen(null_str.str);
> +}
> +
> +static inline size_t strscpy_n(char *dst, const char *src)
Don't we already have a standard function for this?
> +{
> + size_t count = 0;
> +
> + while (count < STR_MAX_SIZE - 1) {
> + dst[count] = src[count];
> + if (src[count] == '\0')
> + goto out;
> + count++;
> + }
> +
> + dst[count] = '\0';
> + pr_warn("blog_ser: string truncated, exceeded max size %d\n", STR_MAX_SIZE);
> +out:
> + return count + 1;
> +}
> +
> +static inline ssize_t __strscpy(char *dst, const char *src)
Don't we already have a standard function for this? A sketch of what I mean follows
this function.
> +{
> + if (src != NULL)
> + return strscpy_n(dst, src);
> + return write_null_str(dst);
> +}
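A minimal sketch on top of strscpy(), assuming its -E2BIG truncation behavior is
acceptable here and that the +1 return (length including the NUL) is what the
serializer expects:

static inline ssize_t __strscpy(char *dst, const char *src)
{
	ssize_t ret;

	if (!src)
		return write_null_str(dst);

	ret = strscpy(dst, src, STR_MAX_SIZE);
	if (ret == -E2BIG) {
		pr_warn("blog_ser: string truncated, exceeded max size %d\n",
			STR_MAX_SIZE);
		ret = STR_MAX_SIZE - 1;
	}
	return ret + 1;	/* length including the terminating NUL */
}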
> +
> +static inline void *strscpy_n_update(char *dst, const char *src, const char *file, int line)
> +{
> + ssize_t ret = __strscpy(dst, src);
> +
> + if (unlikely(ret <= 0 || ret >= STR_MAX_SIZE)) {
> + pr_err("blog_ser: string handling error ret=%zd at %s:%d :: dst='%s' src='%s'\n",
> + ret, file, line, dst, src ? src : "(null)");
> + /* Return safely instead of panicking - truncate and continue */
> + if (ret >= STR_MAX_SIZE) {
> + dst[STR_MAX_SIZE - 1] = '\0';
> + ret = STR_MAX_SIZE;
> + } else {
> + /* Handle null or empty string case */
> + dst[0] = '\0';
> + ret = 1;
> + }
> + }
> + return dst + round_up(ret, 4);
> +}
> +
> +#endif /* _CEPH_BLOG_SER_HELPERS_DEFINED */
> +
> +/* Serialization type macro */
> +#define __blog_ser_type(__buffer, __t) \
> + (__builtin_choose_expr(IS_STATIC_CHAR_ARRAY((__t)), \
> + /* For static arrays (like __func__), copy inline */ \
> + ((__buffer) = (void *)strscpy_n_update((__buffer), \
> + char_ptr(__t), \
> + kbasename(__FILE__), \
> + __LINE__)), \
> + __builtin_choose_expr(IS_STR((__t)), \
> + ((__buffer) = (void *)strscpy_n_update((__buffer), \
> + char_ptr(__t), \
> + kbasename(__FILE__), \
> + __LINE__)), \
> + __builtin_choose_expr(IS_STR_ARRAY((__t)), \
> + /* For dynamic arrays, save NULL and string bytes */ \
> + ((__buffer) = (void *)strscpy_n_update((__buffer), \
> + char_ptr(__t), \
> + kbasename(__FILE__), \
> + __LINE__)), \
> + __builtin_choose_expr(sizeof((__t)) == 1, \
> + (*(uint32_t *)(__buffer) = __suppress_cast_warning(uint32_t, (__t)), \
> + (__buffer) = (void *)((char *)(__buffer) + 4)), \
> + __builtin_choose_expr(sizeof((__t)) == 2, \
> + (*(uint32_t *)(__buffer) = __suppress_cast_warning(uint32_t, (__t)), \
> + (__buffer) = (void *)((char *)(__buffer) + 4)), \
> + __builtin_choose_expr(sizeof((__t)) == 4, \
> + (*(uint32_t *)(__buffer) = __suppress_cast_warning(uint32_t, (__t)), \
> + (__buffer) = (void *)((char *)(__buffer) + 4)), \
> + __builtin_choose_expr(sizeof((__t)) == 8, \
> + (*(uint64_t *)(__buffer) = __suppress_cast_warning(uint64_t, (__t)), \
> + (__buffer) = (void *)((char *)(__buffer) + 8)), \
> + (pr_err("UNSUPPORTED_TYPE: %s:%d: unsupported type size %zu\n", \
> + kbasename(__FILE__), __LINE__, sizeof(__t))) \
> + ))))))))
> +
> +/* Serialization macros */
> +#define ___blog_ser0(__buffer)
> +#define ___blog_ser1(__buffer, __t) (__blog_ser_type(__buffer, __t))
> +#define ___blog_ser2(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser1(__buffer, __args))
> +#define ___blog_ser3(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser2(__buffer, __args))
> +#define ___blog_ser4(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser3(__buffer, __args))
> +#define ___blog_ser5(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser4(__buffer, __args))
> +#define ___blog_ser6(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser5(__buffer, __args))
> +#define ___blog_ser7(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser6(__buffer, __args))
> +#define ___blog_ser8(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser7(__buffer, __args))
> +#define ___blog_ser9(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser8(__buffer, __args))
> +#define ___blog_ser10(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser9(__buffer, __args))
> +#define ___blog_ser11(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser10(__buffer, __args))
> +#define ___blog_ser12(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser11(__buffer, __args))
> +#define ___blog_ser13(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser12(__buffer, __args))
> +#define ___blog_ser14(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser13(__buffer, __args))
> +#define ___blog_ser15(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser14(__buffer, __args))
> +#define ___blog_ser16(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser15(__buffer, __args))
> +#define ___blog_ser17(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser16(__buffer, __args))
> +#define ___blog_ser18(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser17(__buffer, __args))
> +#define ___blog_ser19(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser18(__buffer, __args))
> +#define ___blog_ser20(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser19(__buffer, __args))
> +#define ___blog_ser21(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser20(__buffer, __args))
> +#define ___blog_ser22(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser21(__buffer, __args))
> +#define ___blog_ser23(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser22(__buffer, __args))
> +#define ___blog_ser24(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser23(__buffer, __args))
> +#define ___blog_ser25(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser24(__buffer, __args))
> +#define ___blog_ser26(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser25(__buffer, __args))
> +#define ___blog_ser27(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser26(__buffer, __args))
> +#define ___blog_ser28(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser27(__buffer, __args))
> +#define ___blog_ser29(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser28(__buffer, __args))
> +#define ___blog_ser30(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser29(__buffer, __args))
> +#define ___blog_ser31(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser30(__buffer, __args))
> +#define ___blog_ser32(__buffer, __t, __args...) \
> + (__blog_ser_type(__buffer, __t), ___blog_ser31(__buffer, __args))
> +#define ___blog_ser(__buffer, ...) \
> + ___blog_apply(___blog_ser, blog_narg(__VA_ARGS__))(__buffer, ##__VA_ARGS__)
> +#define blog_ser(...) ___blog_ser(__VA_ARGS__)
> +
> +#endif /* _LINUX_BLOG_SER_H */
> diff --git a/lib/Kconfig b/lib/Kconfig
> index e629449dd2a3..27262af07453 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -512,6 +512,8 @@ source "lib/vdso/Kconfig"
>
> source "lib/fonts/Kconfig"
>
> +source "lib/blog/Kconfig"
> +
> config SG_SPLIT
> def_bool n
> help
> diff --git a/lib/Makefile b/lib/Makefile
> index 1ab2c4be3b66..913c5790cdd3 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -283,6 +283,8 @@ obj-$(CONFIG_ASN1_ENCODER) += asn1_encoder.o
>
> obj-$(CONFIG_FONT_SUPPORT) += fonts/
>
> +obj-$(CONFIG_BLOG) += blog/
> +
> #
> # Build a fast OID lookip registry from include/linux/oid_registry.h
> #
> diff --git a/lib/blog/Kconfig b/lib/blog/Kconfig
> new file mode 100644
> index 000000000000..3207934f51b8
> --- /dev/null
> +++ b/lib/blog/Kconfig
> @@ -0,0 +1,56 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Binary Logging Infrastructure (BLOG)
> +#
> +
> +config BLOG
> + tristate "Binary Logging Infrastructure"
> + help
> + Generic binary logging infrastructure for kernel subsystems.
> + Provides efficient batched logging with binary serialization
> + and deserialization support. Modules using BLOG maintain their
> + own client mappings and debugfs interfaces.
> +
> + If unsure, say N.
> +
> +config BLOG_DEBUG
> + bool "Binary Logging Debug Support"
> + depends on BLOG
> + default n
> + help
> + Enable debug features for the binary logging infrastructure,
> + including memory poisoning, validation checks, and usage tracking.
> + This adds overhead and should only be enabled for debugging.
> +
> + If unsure, say N.
> +
> +config BLOG_MAX_CLIENTS
> + int "Maximum number of logging clients"
> + depends on BLOG
> + range 16 1024
> + default 256
> + help
> + Maximum number of client IDs that can be used by modules.
> + Each module using BLOG can register multiple clients up to
> + this limit. The client ID is stored as u8, so maximum is 256.
> +
> +config BLOG_MAX_SOURCES
> + int "Maximum number of source locations"
> + depends on BLOG
> + range 256 16384
> + default 4096
> + help
> + Maximum number of unique source code locations (file/function/line)
> + that can be tracked. Each unique logging call site gets a source ID.
> +
> +config BLOG_TRACK_USAGE
> + bool "Track usage statistics"
> + depends on BLOG_DEBUG
> + default n
> + help
> + Track usage statistics for logging operations, including counts
> + and bytes used in different contexts (task vs NAPI). This helps
> + identify performance bottlenecks and usage patterns in the logging
> + system but adds overhead to each logging operation.
> +
> + If unsure, say N.
> diff --git a/lib/blog/Makefile b/lib/blog/Makefile
> new file mode 100644
> index 000000000000..8c33a2a6e9d5
> --- /dev/null
> +++ b/lib/blog/Makefile
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for Binary Logging Infrastructure (BLOG)
> +#
> +
> +obj-$(CONFIG_BLOG) += blog.o
> +
> +blog-y := blog_core.o blog_batch.o blog_pagefrag.o blog_des.o blog_module.o
> +
> +# Debug support
> +# blog-$(CONFIG_BLOG_DEBUG) += blog_debug.o
> +
> +# Compiler flags
> +ccflags-$(CONFIG_BLOG_DEBUG) += -DBLOG_DEBUG=1
> +ccflags-$(CONFIG_BLOG_TRACK_USAGE) += -DBLOG_TRACK_USAGE=1
> diff --git a/lib/blog/blog_batch.c b/lib/blog/blog_batch.c
> new file mode 100644
> index 000000000000..beb512836bbe
> --- /dev/null
> +++ b/lib/blog/blog_batch.c
> @@ -0,0 +1,311 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Binary Logging Batch Management
> + *
> + * Migrated from ceph_san_batch.c with all algorithms preserved
> + * Implements per-CPU magazine-based batching for efficient object recycling
> + */
> +
> +#include <linux/slab.h>
> +#include <linux/module.h>
> +#include <linux/percpu.h>
> +#include <linux/preempt.h>
> +#include <linux/spinlock.h>
> +#include <linux/list.h>
> +#include <linux/blog/blog_batch.h>
> +#include <linux/blog/blog.h>
> +
> +/* Number of magazines to preallocate during initialization */
> +#define BLOG_INIT_MAGAZINES 4
> +
> +static struct blog_magazine *alloc_magazine(struct blog_batch *batch)
> +{
> + struct blog_magazine *mag;
> +
> + /* Allocate magazine structure from cache */
> + mag = kmem_cache_zalloc(batch->magazine_cache, GFP_KERNEL);
> + if (!mag)
> + return NULL;
> +
> + INIT_LIST_HEAD(&mag->list);
> + mag->count = 0;
> + return mag;
> +}
> +
> +static void free_magazine(struct blog_batch *batch, struct blog_magazine *mag)
> +{
> + int i;
> + struct blog_tls_pagefrag *composite;
> +
> + /* Free all composites in this magazine before freeing magazine itself */
> + for (i = 0; i < mag->count; i++) {
> + composite = mag->elements[i];
> + if (composite) {
> + /* Composites are allocated with alloc_pages(), free with __free_pages() */
> + __free_pages(virt_to_page(composite),
> + get_order(BLOG_TLS_PAGEFRAG_ALLOC_SIZE));
> + }
> + }
> +
> + /* Free the magazine structure itself */
> + kmem_cache_free(batch->magazine_cache, mag);
> +}
> +
> +/**
> + * blog_batch_init - Initialize the batching system
> + * @batch: Batch structure to initialize
> + * @prepopulate: If true, pre-populate magazines with composites
> + *
> + * Allocates and initializes the per-CPU magazines and global pools.
> + * Magazine structures are allocated from a slab cache.
> + * Composites (512KB each) are allocated via alloc_pages().
> + *
> + * If prepopulate is true (typically for alloc_batch), pre-allocates
> + * magazines filled with composites. log_batch should start empty.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int blog_batch_init(struct blog_batch *batch, struct kmem_cache *mag_cache,
> + bool prepopulate)
> +{
> + int cpu, i, j;
> + struct blog_cpu_magazine *cpu_mag;
> + struct blog_magazine *mag;
> + struct blog_tls_pagefrag *composite;
> + struct page *pages;
> +
> + /* Initialize counters */
> + batch->nr_full = 0;
> + batch->nr_empty = 0;
> +
> + /* Use caller-provided cache or create one */
> + if (mag_cache) {
> + batch->magazine_cache = mag_cache;
> + batch->external_cache = true;
> + } else {
> + batch->magazine_cache = kmem_cache_create("blog_magazine",
> + sizeof(struct blog_magazine),
> + 0, SLAB_HWCACHE_ALIGN, NULL);
> + if (!batch->magazine_cache)
> + return -ENOMEM;
> + batch->external_cache = false;
> + }
> +
> + /* Initialize global magazine lists */
> + INIT_LIST_HEAD(&batch->full_magazines);
> + INIT_LIST_HEAD(&batch->empty_magazines);
> + spin_lock_init(&batch->full_lock);
> + spin_lock_init(&batch->empty_lock);
> +
> + /* Allocate per-CPU magazines */
> + batch->cpu_magazines = alloc_percpu(struct blog_cpu_magazine);
> + if (!batch->cpu_magazines)
> + goto cleanup_cache;
> +
> + /* Initialize per-CPU magazines to NULL (magazines allocated on-demand) */
> + for_each_possible_cpu(cpu) {
> + cpu_mag = per_cpu_ptr(batch->cpu_magazines, cpu);
> + cpu_mag->mag = NULL;
> + }
> +
> + /* Pre-populate magazines if requested (typically for alloc_batch) */
> + if (prepopulate) {
> + for (i = 0; i < BLOG_INIT_MAGAZINES; i++) {
> + mag = alloc_magazine(batch);
> + if (!mag)
> + goto cleanup;
> +
> + /* Populate magazine with composites allocated via alloc_pages() */
> + for (j = 0; j < BLOG_MAGAZINE_SIZE; j++) {
> + /* Allocate 512KB composite via page allocator */
> + pages = alloc_pages(GFP_KERNEL | __GFP_ZERO,
> + get_order(BLOG_TLS_PAGEFRAG_ALLOC_SIZE));
> + if (!pages) {
> + /* Free partially filled magazine */
> + free_magazine(batch, mag);
> + goto cleanup;
> + }
> + composite = page_address(pages);
> + /* Composite is zeroed - ctx->id == 0, will be initialized on first use */
> + mag->elements[j] = composite;
> + mag->count++;
> + }
> +
> + /* Add full magazine to the full pool */
> + spin_lock(&batch->full_lock);
> + list_add(&mag->list, &batch->full_magazines);
> + batch->nr_full++;
> + spin_unlock(&batch->full_lock);
> + }
> + }
> +
> + return 0;
> +
> +cleanup:
> + blog_batch_cleanup(batch);
> + return -ENOMEM;
> +
> +cleanup_cache:
> + if (!batch->external_cache && batch->magazine_cache)
> + kmem_cache_destroy(batch->magazine_cache);
> + return -ENOMEM;
> +}
> +EXPORT_SYMBOL(blog_batch_init);
> +
> +/**
> + * blog_batch_cleanup - Clean up the batching system
> + * @batch: Batch structure to clean up
> + *
> + * Frees all magazines and composites, and destroys the magazine cache.
> + */
> +void blog_batch_cleanup(struct blog_batch *batch)
> +{
> + int cpu;
> + struct blog_magazine *mag, *tmp;
> + struct blog_cpu_magazine *cpu_mag;
> +
> + /* Free per-CPU magazines */
> + if (batch->cpu_magazines) {
> + for_each_possible_cpu(cpu) {
> + cpu_mag = per_cpu_ptr(batch->cpu_magazines, cpu);
> + if (cpu_mag->mag)
> + free_magazine(batch, cpu_mag->mag);
> + }
> + free_percpu(batch->cpu_magazines);
> + }
> +
> + /* Free magazines in the full pool */
> + spin_lock(&batch->full_lock);
> + list_for_each_entry_safe(mag, tmp, &batch->full_magazines, list) {
> + list_del(&mag->list);
> + batch->nr_full--;
> + free_magazine(batch, mag);
> + }
> + spin_unlock(&batch->full_lock);
> +
> + /* Free magazines in the empty pool */
> + spin_lock(&batch->empty_lock);
> + list_for_each_entry_safe(mag, tmp, &batch->empty_magazines, list) {
> + list_del(&mag->list);
> + batch->nr_empty--;
> + free_magazine(batch, mag);
> + }
> + spin_unlock(&batch->empty_lock);
> +
> + /* Destroy magazine cache */
> + if (!batch->external_cache && batch->magazine_cache)
> + kmem_cache_destroy(batch->magazine_cache);
> +
> + batch->magazine_cache = NULL;
> + batch->external_cache = false;
> +}
> +EXPORT_SYMBOL(blog_batch_cleanup);
> +
> +/**
> + * blog_batch_get - Get an element from the batch
> + * @batch: Batch to get element from
> + *
> + * Return: Element from the magazine, or NULL if none available
> + */
> +void *blog_batch_get(struct blog_batch *batch)
> +{
> + struct blog_cpu_magazine *cpu_mag;
> + struct blog_magazine *old_mag, *new_mag;
> + void *element = NULL;
> +
> + preempt_disable();
> + cpu_mag = this_cpu_ptr(batch->cpu_magazines);
> +
> + /* If we have a magazine and it has elements, use it */
> + if (cpu_mag->mag && cpu_mag->mag->count > 0) {
> + element = cpu_mag->mag->elements[--cpu_mag->mag->count];
> + goto out;
> + }
> +
> + /* Current magazine is empty, try to get a full one */
> + old_mag = cpu_mag->mag;
> +
> + /* Return old magazine to empty pool if we have one */
> + if (old_mag) {
> + spin_lock(&batch->empty_lock);
> + list_add(&old_mag->list, &batch->empty_magazines);
> + batch->nr_empty++;
> + spin_unlock(&batch->empty_lock);
> + cpu_mag->mag = NULL;
> + }
> +
> + if (batch->nr_full > 0) {
> + /* Try to get a full magazine */
> + spin_lock(&batch->full_lock);
> + if (!list_empty(&batch->full_magazines)) {
> + new_mag = list_first_entry(&batch->full_magazines,
> + struct blog_magazine, list);
> + list_del(&new_mag->list);
> + batch->nr_full--;
> + spin_unlock(&batch->full_lock);
> +
> + cpu_mag->mag = new_mag;
> + if (new_mag->count > 0)
> + element = new_mag->elements[--new_mag->count];
> + } else {
> + spin_unlock(&batch->full_lock);
> + }
> + }
> +out:
> + preempt_enable();
> + return element;
> +}
> +EXPORT_SYMBOL(blog_batch_get);
> +
> +/**
> + * blog_batch_put - Put an element back into the batch
> + * @batch: Batch to put element into
> + * @element: Element to put back
> + */
> +void blog_batch_put(struct blog_batch *batch, void *element)
> +{
> + struct blog_cpu_magazine *cpu_mag;
> + struct blog_magazine *mag;
> +
> + preempt_disable();
> + cpu_mag = this_cpu_ptr(batch->cpu_magazines);
> +
> + /* Optimistically try to add to current magazine */
> + if (likely(cpu_mag->mag && cpu_mag->mag->count < BLOG_MAGAZINE_SIZE)) {
> + cpu_mag->mag->elements[cpu_mag->mag->count++] = element;
> + goto out;
> + }
> +
> + /* If current magazine is full, move it to full pool */
> + if (likely(cpu_mag->mag && cpu_mag->mag->count >= BLOG_MAGAZINE_SIZE)) {
> + spin_lock(&batch->full_lock);
> + list_add_tail(&cpu_mag->mag->list, &batch->full_magazines);
> + batch->nr_full++;
> + spin_unlock(&batch->full_lock);
> + cpu_mag->mag = NULL;
> + }
> +
> + /* Get new magazine if needed */
> + if (likely(!cpu_mag->mag)) {
> + /* Try to get from empty pool first */
> + spin_lock(&batch->empty_lock);
> + if (!list_empty(&batch->empty_magazines)) {
> + mag = list_first_entry(&batch->empty_magazines,
> + struct blog_magazine, list);
> + list_del(&mag->list);
> + batch->nr_empty--;
> + spin_unlock(&batch->empty_lock);
> + cpu_mag->mag = mag;
> + } else {
> + spin_unlock(&batch->empty_lock);
> + cpu_mag->mag = alloc_magazine(batch);
> + }
> +
> + if (unlikely(!cpu_mag->mag))
> + goto out;
> + }
> + /* Add element to magazine */
> + cpu_mag->mag->elements[cpu_mag->mag->count++] = element;
> +out:
> + preempt_enable();
> +}
> diff --git a/lib/blog/blog_core.c b/lib/blog/blog_core.c
> new file mode 100644
> index 000000000000..ede382f0ce5e
> --- /dev/null
> +++ b/lib/blog/blog_core.c
> @@ -0,0 +1,772 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Binary Logging Infrastructure - Core Implementation
> + *
> + * Migrated from ceph_san_logger.c with algorithms preserved
> + * Client ID management removed - modules handle their own mappings
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/string.h>
> +#include <linux/printk.h>
> +#include <linux/time.h>
> +#include <linux/percpu.h>
> +#include <linux/spinlock.h>
> +#include <linux/list.h>
> +#include <linux/sched.h>
> +#include <linux/sched/task.h>
> +#include <linux/seq_file.h>
> +#include <linux/atomic.h>
> +
> +#include <linux/blog/blog.h>
> +#include <linux/blog/blog_batch.h>
> +#include <linux/blog/blog_pagefrag.h>
> +#include <linux/blog/blog_ser.h>
> +#include <linux/blog/blog_des.h>
> +#include <linux/blog/blog_module.h>
> +
> +static void blog_tls_release_verbose(void *ptr);
> +#define NULL_STR "(NULL)"
> +#define BLOG_LOG_BATCH_MAX_FULL 16
> +
> +/* Core BLOG functions - all require a valid logger parameter */
> +
> +/**
> + * blog_is_valid_kernel_addr - Check if address is in valid kernel address range
> + * @addr: Address to check
> + *
> + * Returns true if address is in valid kernel address range
> + */
> +bool blog_is_valid_kernel_addr(const void *addr)
> +{
> + if (virt_addr_valid(addr))
> + return true;
> + return false;
> +}
> +EXPORT_SYMBOL(blog_is_valid_kernel_addr);
> +
> +/**
> + * get_context_id - Get a unique context ID
> + * @logger: Logger instance to use
> + *
> + * Acquires a unique ID for a TLS context using the logger's counter
> + *
> + * Returns a unique context ID
> + */
> +static u64 get_context_id(struct blog_logger *logger)
> +{
> + u64 id;
> +
> + spin_lock(&logger->ctx_id_lock);
> + id = logger->next_ctx_id++;
> + spin_unlock(&logger->ctx_id_lock);
> + return id;
> +}
> +
> +/**
> + * validate_tls_ctx - Validate a TLS context
> + * @ctx: Context to validate
> + *
> + * Returns true if context is valid, false otherwise
> + */
> +static inline bool validate_tls_ctx(struct blog_tls_ctx *ctx)
> +{
> + if (!ctx)
> + return false;
> +
> +#if BLOG_DEBUG_POISON
> + if (ctx->debug_poison != BLOG_CTX_POISON) {
> + pr_err("BUG: TLS context id=%llu (%llx) has invalid debug_poison value 0x%llx\n",
> + ctx->id, (unsigned long long)ctx,
> + (unsigned long long)ctx->debug_poison);
> + return false;
> + }
> +#endif
> +
> + if (atomic_read(&ctx->refcount) != 1) {
> + pr_err("BUG: TLS context id=%llu (%llx) refcount %d, expected 1\n",
> + ctx->id, (unsigned long long)ctx,
> + atomic_read(&ctx->refcount));
> + return false;
> + }
> +
> + return true;
> +}
> +
> +/**
> + * add_context_to_global_list - Add a context to the global list
> + * @ctx: The context to add to the global list
> + *
> + * Adds the context to the global list of contexts and updates stats
> + */
> +static void add_context_to_logger_list(struct blog_logger *logger, struct blog_tls_ctx *ctx)
> +{
> + if (!logger)
> + return;
> +
> + spin_lock(&logger->lock);
> + list_add(&ctx->list, &logger->contexts);
> + logger->total_contexts_allocated++;
> + spin_unlock(&logger->lock);
> +}
> +
> +static void remove_context_from_logger_list(struct blog_logger *logger,
> + struct blog_tls_ctx *ctx)
> +{
> + if (!logger)
> + return;
> +
> + spin_lock(&logger->lock);
> + if (!list_empty(&ctx->list)) {
> + list_del_init(&ctx->list);
> + if (logger->total_contexts_allocated)
> + logger->total_contexts_allocated--;
> + }
> + spin_unlock(&logger->lock);
> +}
> +
> +static void blog_tls_clear_task(struct blog_tls_ctx *ctx)
> +{
> + struct task_struct *task;
> +
> + if (!ctx)
> + return;
> +
> + task = READ_ONCE(ctx->task);
> + if (task) {
> + WRITE_ONCE(ctx->task, NULL);
> + put_task_struct(task);
> + }
> +}
> +
> +static inline struct blog_tls_ctx *get_new_ctx(struct blog_logger *logger)
> +{
> + struct blog_tls_pagefrag *composite;
> + struct blog_tls_ctx *ctx;
> + struct blog_pagefrag *pf;
> + bool is_new = false;
> +
> + if (!logger)
> + return NULL;
> +
> + /* Pop a composite from the alloc_batch */
> + composite = blog_batch_get(&logger->alloc_batch);
> + if (!composite) {
> + /* If batch is empty, allocate a new composite via page allocator (512KB) */
> + struct page *pages;
> +
> + pages = alloc_pages(GFP_KERNEL | __GFP_ZERO,
> + get_order(BLOG_TLS_PAGEFRAG_ALLOC_SIZE));
> + if (!pages) {
> + pr_err("%s: Failed to allocate composite via alloc_pages\n", __func__);
> + return NULL;
> + }
> + composite = page_address(pages);
> + is_new = true;
> + }
> +
> + ctx = &composite->ctx;
> +
> + /* Check if this composite needs initialization (new or first use from batch) */
> + if (is_new || ctx->id == 0) {
> + /* Initialize context fields for new composite or uninitialized one */
> + INIT_LIST_HEAD(&ctx->list);
> + atomic_set(&ctx->refcount, 0);
> + ctx->id = get_context_id(logger);
> +
> +#if BLOG_DEBUG_POISON
> + ctx->debug_poison = BLOG_CTX_POISON;
> +#endif
> +
> + pr_debug("[%d]blog: initialized new composite id=%llu (%llx)\n",
> + smp_processor_id(), ctx->id, (unsigned long long)ctx);
> + }
> +
> + /* Always refresh these fields on reuse */
> + ctx->logger = logger;
> + ctx->release = blog_tls_release_verbose;
> +
> +#if BLOG_DEBUG_POISON
> + /* Verify debug poison */
> + if (ctx->debug_poison != BLOG_CTX_POISON) {
> + pr_err("%s: Context id=%llu has invalid debug_poison value 0x%llx\n",
> + __func__, ctx->id, (unsigned long long)ctx->debug_poison);
> + WARN_ON_ONCE(1);
> + return NULL;
> + }
> +#endif
> +
> + /* Initialize the embedded pagefrag to point to inline buffer */
> + pf = &composite->pf;
> + pf->pages = NULL; /* No separate pages, using inline buffer */
> + pf->buffer = composite->buf; /* Point to inline buffer */
> + pf->capacity = BLOG_TLS_PAGEFRAG_BUFFER_SIZE; /* Inline buffer size (512KB - header) */
> + spin_lock_init(&pf->lock);
> + pf->head = 0;
> + pf->alloc_count = 0;
> + pf->active_elements = 0;
> + pf->last_entry = NULL;
> +
> + /* Debug: Write buffer offset markers */
> + {
> + size_t expected_offset = sizeof(struct blog_tls_ctx) + sizeof(struct blog_pagefrag);
> + size_t actual_offset = (char *)composite->buf - (char *)composite;
> + *(u64 *)composite->buf = expected_offset;
> + pr_err("blog: composite=%p, ctx=%p, pf=%p, buf=%p, expected_offset=%zu, actual_offset=%zu\n",
> + composite, &composite->ctx, &composite->pf, composite->buf,
> + expected_offset, actual_offset);
> + }
> +
> + /* Add to logger's context list */
> + add_context_to_logger_list(logger, ctx);
> +
> + /* Set base timestamp */
> + ctx->base_jiffies = jiffies;
> +
> + blog_logger_print_stats(logger);
> + return ctx; /* Context returned with refcount = 0 */
> +}
> +
> +/**
> + * is_valid_active_ctx - Validate an active TLS context
> + * @ctx: Context to validate
> + * @context_description: String describing the context for error messages
> + *
> + * Returns true if context is valid (poison OK, refcount == 1), false otherwise
> + */
> +static inline bool is_valid_active_ctx(struct blog_tls_ctx *ctx,
> + const char *context_description)
> +{
> + if (!ctx) {
> + pr_err("BUG: %s context is NULL.\n", context_description);
> + return false;
> + }
> +
> +#if BLOG_DEBUG_POISON
> + if (ctx->debug_poison != BLOG_CTX_POISON) {
> + pr_err("BUG: %s context id=%llu (%llx) has invalid debug_poison value 0x%llx\n",
> + context_description, ctx->id, (unsigned long long)ctx,
> + (unsigned long long)ctx->debug_poison);
> + return false;
> + }
> +#endif
> +
> + if (atomic_read(&ctx->refcount) != 1) {
> + pr_err("BUG: %s context id=%llu (%llx) refcount %d, expected 1\n",
> + context_description, ctx->id, (unsigned long long)ctx,
> + atomic_read(&ctx->refcount));
> + return false;
> + }
> + return true;
> +}
> +
> +/* Release function for TLS storage */
> +static void blog_tls_release(void *ptr)
> +{
> + struct blog_tls_ctx *ctx = ptr;
> + struct blog_tls_pagefrag *composite;
> +
> + if (!ctx)
> + return;
> +
> + if (atomic_dec_return(&ctx->refcount) != 0) {
> + pr_err("BUG: TLS context id=%llu refcount %d after release\n",
> + ctx->id, atomic_read(&ctx->refcount));
> + panic("blog: TLS context id=%llu refcount %d after release\n",
> + ctx->id, atomic_read(&ctx->refcount));
> + }
> + pr_debug("blog: decremented refcount=0 for context id=%llu\n", ctx->id);
> +
> + /* Clear task association */
> + blog_tls_clear_task(ctx);
> + pr_debug("blog: releasing TLS context for pid %d [%s]\n", ctx->pid,
> + ctx->comm);
> +
> + if (ctx->logger) {
> + /* Remove from logger's context list */
> + remove_context_from_logger_list(ctx->logger, ctx);
> +
> + /* Recycle composite to log_batch - logs remain readable!
> + * Composite will flow: log_batch → drain → reset → alloc_batch */
> + composite = blog_ctx_container(ctx);
> + blog_batch_put(&ctx->logger->log_batch, composite);
> +
> + /* If log_batch has too many full magazines, move one to alloc_batch */
> + if (ctx->logger->log_batch.nr_full > BLOG_LOG_BATCH_MAX_FULL) {
> + struct blog_magazine *mag;
> +
> + spin_lock(&ctx->logger->log_batch.full_lock);
> + if (!list_empty(&ctx->logger->log_batch.full_magazines)) {
> + mag = list_first_entry(
> + &ctx->logger->log_batch.full_magazines,
> + struct blog_magazine, list);
> + list_del(&mag->list);
> + ctx->logger->log_batch.nr_full--;
> + spin_unlock(&ctx->logger->log_batch.full_lock);
> +
> + spin_lock(&ctx->logger->alloc_batch.full_lock);
> + list_add(&mag->list,
> + &ctx->logger->alloc_batch.full_magazines);
> + ctx->logger->alloc_batch.nr_full++;
> + spin_unlock(&ctx->logger->alloc_batch.full_lock);
> + } else {
> + spin_unlock(&ctx->logger->log_batch.full_lock);
> + }
> + }
> + } else {
> + pr_err("BUG: TLS context id=%llu has no logger for batch release\n",
> + ctx->id);
> + }
> +}
> +
> +static void blog_tls_release_verbose(void *ptr)
> +{
> + struct blog_tls_ctx *ctx = (struct blog_tls_ctx *)ptr;
> +
> + if (!ctx) {
> + pr_err("blog -- Callback : invalid TLS context pointer %d\n",
> + current->pid);
> + return;
> + }
> +#if BLOG_DEBUG_POISON
> + if (ctx->debug_poison != BLOG_CTX_POISON) {
> + pr_err("%s: invalid TLS context id=%llu has invalid debug_poison value 0x%llx\n",
> + __func__, ctx->id, (unsigned long long)ctx->debug_poison);
> + WARN_ON_ONCE(1);
> + return;
> + }
> +#endif
> + if (atomic_read(&ctx->refcount) != 1) {
> + pr_err("%s: invalid TLS context refcount %d for pid %d [%s]\n",
> + __func__, atomic_read(&ctx->refcount), ctx->pid, ctx->comm);
> + WARN_ON_ONCE(1);
> + return;
> + }
> + blog_tls_release(ctx);
> +}
> +
> +static struct blog_tls_ctx *lookup_active_ctx(struct blog_logger *logger)
> +{
> + struct blog_tls_ctx *ctx;
> +
> + if (!logger)
> + return NULL;
> +
> + if (logger->has_slot) {
> + u8 slot = logger->slot_id;
> + struct blog_tls_ctx *slot_ctx;
> +
> + if (slot < BLOG_MAX_MODULES) {
> + slot_ctx = READ_ONCE(current->blog_contexts[slot]);
> + if (slot_ctx)
> + return slot_ctx;
> + }
> + }
> +
> + spin_lock(&logger->lock);
> + list_for_each_entry(ctx, &logger->contexts, list) {
> + if (ctx->task == current) {
> + spin_unlock(&logger->lock);
> + return ctx;
> + }
> + }
> + spin_unlock(&logger->lock);
> + return NULL;
> +}
> +
> +/**
> + * blog_get_tls_ctx - Get or create TLS context for current task
> + * @logger: Logger instance to use
> + *
> + * Returns pointer to TLS context or NULL on error
> + */
> +struct blog_tls_ctx *blog_get_tls_ctx(struct blog_logger *logger)
> +{
> + struct blog_tls_ctx *ctx = lookup_active_ctx(logger);
> +
> + /* Context already exists - handled by slot-based system */
> + if (ctx)
> + return ctx;
> +
> + if (logger && logger->has_slot && logger->owner_ctx)
> + return blog_get_tls_ctx_ctx(logger->owner_ctx);
> +
> + /* Create new context */
> + pr_debug("blog: creating new TLS context for pid %d [%s]\n",
> + current->pid, current->comm);
> +
> + ctx = get_new_ctx(logger);
> + if (!ctx)
> + return NULL;
> +
> + blog_tls_clear_task(ctx);
> + get_task_struct(current);
> +
> + /* Set up TLS specific parts */
> + /* Note: slot-based storage is handled by blog_module layer */
> + ctx->task = current;
> + ctx->pid = current->pid;
> + strscpy(ctx->comm, current->comm, TASK_COMM_LEN);
> +
> + /* Increment refcount from 0 to 1 */
> + if (atomic_inc_return(&ctx->refcount) != 1) {
> + pr_err("%s: Failed to set refcount=1 for new TLS context id=%llu (was %d before inc)\n",
> + __func__, ctx->id, atomic_read(&ctx->refcount) - 1);
> + WARN_ON_ONCE(1);
> + }
> +
> + pr_debug(
> + "blog: successfully created new TLS context id=%llu for pid %d [%s]\n",
> + ctx->id, ctx->pid, ctx->comm);
> + return ctx;
> +}
> +EXPORT_SYMBOL(blog_get_tls_ctx);
> +
> +/**
> + * blog_get_source_id - Get or create a source ID for the given location
> + * @logger: Logger instance to use (NULL for global)
> + * @file: Source file name
> + * @func: Function name
> + * @line: Line number
> + * @fmt: Format string
> + *
> + * Returns a unique ID for this source location
> + */
> +u32 blog_get_source_id(struct blog_logger *logger, const char *file,
> + const char *func, unsigned int line, const char *fmt)
> +{
> + u32 id;
> +
> + if (!logger)
> + return 0;
> +
> + id = atomic_inc_return(&logger->next_source_id);
> +
> + if (id >= BLOG_MAX_SOURCE_IDS) {
> + /* If we run out of IDs, just use the first one */
> + pr_warn("blog: source ID overflow, reusing ID 1\n");
> + id = 1;
> + }
> +
> + /* Store the source information in the logger's map */
> + logger->source_map[id].file = file;
> + logger->source_map[id].func = func;
> + logger->source_map[id].line = line;
> + logger->source_map[id].fmt = fmt;
> + logger->source_map[id].warn_count = 0;
> +
> + pr_err("blog_get_source_id: id=%u, file=%s, func=%s, line=%u, fmt=%p (%s)\n",
> + id, file, func, line, fmt, fmt ? fmt : "NULL");
Do we really need pr_err() here? Maybe pr_debug()?
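Registering a source ID is not an error path, so this looks like debug output
to me, i.e. something like:

   pr_debug("blog_get_source_id: id=%u, file=%s, func=%s, line=%u, fmt=%p (%s)\n",
            id, file, func, line, fmt, fmt ? fmt : "NULL");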
> +
> + return id;
> +}
> +EXPORT_SYMBOL(blog_get_source_id);
> +
> +/**
> + * blog_get_source_info - Get source info for a given ID
> + * @id: Source ID
> + *
> + * Returns the source information for this ID
> + */
> +struct blog_source_info *blog_get_source_info(struct blog_logger *logger, u32 id)
> +{
> + if (!logger || unlikely(id == 0 || id >= BLOG_MAX_SOURCE_IDS))
> + return NULL;
> + return &logger->source_map[id];
> +}
> +EXPORT_SYMBOL(blog_get_source_info);
> +
> +/**
> + * blog_log - Log a message
> + * @source_id: Source ID for this location
> + * @client_id: Client ID for this message (module-specific)
> + * @needed_size: Size needed for the message
> + *
> + * Returns a buffer to write the message into
> + */
> +void *blog_log(struct blog_logger *logger, u32 source_id, u8 client_id, size_t needed_size)
> +{
> + struct blog_tls_ctx *ctx;
> + struct blog_log_entry *entry = NULL;
> + int alloc;
> + int retry_count = 0;
> +
> +#if BLOG_TRACK_USAGE
> + struct blog_source_info *source;
> +#endif
> + /* Preserve payload length; compute rounded total allocation separately */
> + size_t payload_len = needed_size;
> +
> + if (payload_len > BLOG_MAX_PAYLOAD) {
> + pr_warn_once("%s: payload %zu exceeds max %u\n",
> + __func__, payload_len, BLOG_MAX_PAYLOAD);
> + return NULL;
> + }
> +
> + needed_size = round_up(payload_len + sizeof(struct blog_log_entry), 8);
Why has 8 been hardcoded here? Maybe it makes sense to introduce a named
constant?
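Something like this, for instance (BLOG_ENTRY_ALIGN is just a name I am making
up here):

   #define BLOG_ENTRY_ALIGN 8 /* log entries are padded to an 8-byte boundary */

   needed_size = round_up(payload_len + sizeof(struct blog_log_entry),
                          BLOG_ENTRY_ALIGN);

The same constant could then also cover the other round_up(..., 8) call sites
flagged below (blog_log_commit() and blog_log_iter_next()).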
> +#if BLOG_TRACK_USAGE
> + /* Get source info to update stats */
> + source = blog_get_source_info(logger, source_id);
> + if (unlikely(source)) {
> + if (in_serving_softirq()) {
> + atomic_inc(&source->napi_usage);
> + atomic_add(needed_size, &source->napi_bytes);
> + } else {
> + atomic_inc(&source->task_usage);
> + atomic_add(needed_size, &source->task_bytes);
> + }
> + }
> +#endif
> +
> + while (entry == NULL) {
> + struct blog_pagefrag *pf;
> +
> + ctx = blog_get_ctx(logger);
> + if (!ctx) {
> + pr_err("Failed to get TLS context\n");
> + return NULL;
> + }
> + if (!blog_is_valid_kernel_addr(ctx)) {
> + pr_err("%s: invalid TLS context address: %pK\n",
> + __func__, ctx);
> + return NULL;
> + }
> + if (unlikely(retry_count)) {
> + pr_debug(
Why do we place the format string on the next line?
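The coding style asks not to break user-visible strings so that they stay
greppable, so I would keep the format string on the same line as the call,
e.g.:

   if (unlikely(retry_count)) {
           pr_debug("[%d]Retrying allocation with ctx %llu (%s, pid %d) (retry %d, needed_size=%zu @ %d)\n",
                    smp_processor_id(), ctx->id, ctx->comm,
                    ctx->pid, retry_count, needed_size, source_id);
   }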
> + "[%d]Retrying allocation with ctx %llu (%s, pid %d) (retry %d, needed_size=%zu @ %d)\n",
> + smp_processor_id(), ctx->id, ctx->comm,
> + ctx->pid, retry_count, needed_size, source_id);
> + }
> +
> + pf = blog_ctx_pf(ctx);
> + alloc = blog_pagefrag_reserve(pf, needed_size);
> + if (alloc == -ENOMEM) {
> + pr_debug("%s: allocation failed (needed %zu), resetting context\n",
> + __func__, needed_size);
> + blog_pagefrag_reset(pf);
> + retry_count++;
> + if (retry_count > 3) {
> + pr_err("%s: failed to allocate after 3 retries\n", __func__);
> + return NULL;
> + }
> + continue;
> + }
> +
> + entry = blog_pagefrag_get_ptr(pf, alloc);
> + if (!entry) {
> + pr_err("%s: failed to get pointer from pagefrag\n", __func__);
> + return NULL;
> + }
> + pf->last_entry = entry;
> +
> + /* Store pending publish info for blog_log_commit() */
> + ctx->pending_offset = alloc;
> + ctx->pending_size = needed_size;
> + }
> +
> +#if BLOG_DEBUG_POISON
> + entry->debug_poison = BLOG_LOG_ENTRY_POISON;
> +#endif
> + entry->ts_delta = (u32)(jiffies - ctx->base_jiffies);
> + entry->source_id = (u16)source_id;
> + entry->len = (u8)payload_len;
> + entry->client_id = client_id;
> + entry->flags = 0;
> +
> + pr_err("blog_log: source_id=%u, payload_len=%zu, needed_size=%zu, offset=%d\n",
> + source_id, payload_len, needed_size, ctx->pending_offset);
Ditto. Why pr_err() here?
> +
> + /* Debug: Show entry location relative to composite */
If it is debug output, then should it be under CONFIG_BLOG_DEBUG?
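For example, something along these lines (assuming the BLOG_DEBUG define from
lib/blog/Makefile is meant exactly for such blocks):

   #ifdef BLOG_DEBUG
   {
           struct blog_tls_pagefrag *composite = blog_ctx_container(ctx);

           /* Show entry location relative to the composite buffer */
           pr_debug("blog_log: composite=%p, entry=%p, entry_offset=%zu, buffer_start_offset=%zu\n",
                    composite, entry,
                    (size_t)((char *)entry - (char *)composite),
                    (size_t)((char *)composite->buf - (char *)composite));
   }
   #endif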
> + {
> + struct blog_tls_pagefrag *composite = blog_ctx_container(ctx);
> + size_t entry_offset_from_composite = (char *)entry - (char *)composite;
> + size_t buffer_offset_from_composite = (char *)composite->buf - (char *)composite;
> + pr_err("blog_log: composite=%p, entry=%p, entry_offset=%zu, buffer_start_offset=%zu\n",
> + composite, entry, entry_offset_from_composite, buffer_offset_from_composite);
> + }
> +
> + return entry->buffer;
> +}
> +EXPORT_SYMBOL(blog_log);
> +
> +/**
> + * blog_log_commit - Publish a reserved log entry
> + * @logger: Logger instance
> + * @actual_size: Actual bytes written during serialization
> + *
> + * Publishes the log entry that was reserved by the last blog_log() call.
> + * Must be called after serialization is complete to make the entry visible
> + * to readers.
> + *
> + * Context: Same context as the preceding blog_log() call
> + * Return: 0 on success, negative error code on failure
> + */
> +int blog_log_commit(struct blog_logger *logger, size_t actual_size)
> +{
> + struct blog_tls_ctx *ctx = blog_get_ctx(logger);
> + struct blog_pagefrag *pf;
> + struct blog_log_entry *entry;
> + size_t total_size;
> +
> + if (!ctx)
> + return -EINVAL;
> +
> + pf = blog_ctx_pf(ctx);
> +
> + /* Get the entry we're committing and update its length to actual size */
> + entry = blog_pagefrag_get_ptr(pf, ctx->pending_offset);
> + if (entry)
> + entry->len = (u8)actual_size;
> +
> + /* actual_size is payload only, need to add header + alignment */
> + total_size = round_up(sizeof(struct blog_log_entry) + actual_size, 8);
Ditto. Hardcoded 8?
> +
> + pr_err("blog_log_commit: pending_offset=%d, actual_size=%zu, total_size=%zu, pending_size=%zu\n",
> + ctx->pending_offset, actual_size, total_size, ctx->pending_size);
Ditto. pr_err()?
> +
> + blog_pagefrag_publish(pf, ctx->pending_offset + total_size);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(blog_log_commit);
> +
> +/**
> + * blog_get_napi_ctx - Get NAPI context for current CPU
> + */
> +struct blog_tls_ctx *blog_get_napi_ctx(struct blog_logger *logger)
> +{
> + struct blog_tls_ctx **napi_ctx_ptr;
> +
> + if (!logger || !logger->napi_ctxs)
> + return NULL;
> +
> + napi_ctx_ptr = per_cpu_ptr(logger->napi_ctxs, smp_processor_id());
> + return napi_ctx_ptr ? *napi_ctx_ptr : NULL;
> +}
> +EXPORT_SYMBOL(blog_get_napi_ctx);
> +
> +/**
> + * blog_set_napi_ctx - Set NAPI context for current CPU
> + */
> +void blog_set_napi_ctx(struct blog_logger *logger, struct blog_tls_ctx *ctx)
> +{
> + struct blog_tls_ctx **napi_ctx_ptr;
> +
> + if (!logger || !logger->napi_ctxs)
> + return;
> +
> + napi_ctx_ptr = per_cpu_ptr(logger->napi_ctxs, smp_processor_id());
> + if (napi_ctx_ptr)
> + *napi_ctx_ptr = ctx;
> +}
> +EXPORT_SYMBOL(blog_set_napi_ctx);
> +
> +/**
> + * blog_get_ctx - Get appropriate context based on context type
> + */
> +struct blog_tls_ctx *blog_get_ctx(struct blog_logger *logger)
> +{
> + if (in_serving_softirq()) {
> + struct blog_tls_ctx *n = blog_get_napi_ctx(logger);
> +
> + if (n)
> + return n;
> + /* Fallback to TLS context if no NAPI context set */
> + }
> + return blog_get_tls_ctx(logger);
> +}
> +EXPORT_SYMBOL(blog_get_ctx);
> +
> +/**
> + * blog_log_iter_init - Initialize the iterator for a specific pagefrag
> + */
> +void blog_log_iter_init(struct blog_log_iter *iter, struct blog_pagefrag *pf,
> + u64 head_snapshot)
> +{
> + if (!iter || !pf)
> + return;
> +
> + iter->pf = pf;
> + iter->current_offset = 0;
> + iter->end_offset = head_snapshot;
> + iter->prev_offset = 0;
> + iter->steps = 0;
> +}
> +EXPORT_SYMBOL(blog_log_iter_init);
> +
> +/**
> + * blog_log_iter_next - Get next log entry
> + */
> +struct blog_log_entry *blog_log_iter_next(struct blog_log_iter *iter)
> +{
> + struct blog_log_entry *entry;
> +
> + if (!iter || iter->current_offset >= iter->end_offset)
> + return NULL;
> +
> + entry = blog_pagefrag_get_ptr(iter->pf, iter->current_offset);
> + if (!entry)
> + return NULL;
> +
> + iter->prev_offset = iter->current_offset;
> + iter->current_offset +=
> + round_up(sizeof(struct blog_log_entry) + entry->len, 8);
Ditto. Hardcoded 8?
> + iter->steps++;
> +
> + return entry;
> +}
> +EXPORT_SYMBOL(blog_log_iter_next);
> +
> +/**
> + * blog_des_entry - Deserialize entry with callback
> + */
> +int blog_des_entry(struct blog_logger *logger, struct blog_log_entry *entry,
> + char *output, size_t out_size, blog_client_des_fn client_cb)
> +{
> + int len = 0;
> + struct blog_source_info *source;
> +
> + if (!entry || !output)
> + return -EINVAL;
> +
> + /* Let module handle client_id if callback provided */
> + if (client_cb) {
> + len = client_cb(output, out_size, entry->client_id);
> + if (len < 0)
> + return len;
> + }
> +
> + /* Get source info */
> + source = blog_get_source_info(logger, entry->source_id);
> + if (!source) {
> + len += snprintf(output + len, out_size - len,
I see output + len and out_size - len multiple times in this function. Maybe it
makes sense to calculate them once before passing them as arguments? Also, I am
worried about improper len values. Does it make sense to check the return value
before using it as an offset?
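Just a sketch of what I mean for the tail of the function (scnprintf() clamps
its return value to the space actually used, and an error from
blog_des_reconstruct() is propagated instead of being added to len):

   char *p = output;
   size_t avail = out_size;
   int n;

   /* scnprintf() never returns more than avail, so p/avail stay in bounds */
   n = scnprintf(p, avail, "[%s:%s:%u] ",
                 source->file, source->func, source->line);
   p += n;
   avail -= n;

   n = blog_des_reconstruct(source->fmt, entry->buffer, 0, entry->len,
                            p, avail);
   if (n < 0)
           return n;

   return (p - output) + n;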
> + "[unknown source %u]", entry->source_id);
> + return len;
> + }
> +
> + pr_err("blog_des_entry: source_id=%u, source=%p, fmt=%p (%s), entry->len=%u\n",
> + entry->source_id, source, source->fmt, source->fmt ? source->fmt : "NULL", entry->len);
Ditto. pr_err()?
> +
> + /* Debug: Show entry pointer and payload pointer */
> + pr_err("blog_des_entry: entry=%p, entry->buffer=%p, buffer_offset=%ld\n",
> + entry, entry->buffer, (char *)entry->buffer - (char *)entry);
Ditto. pr_err()? And if it is debug output, maybe it should be under
CONFIG_BLOG_DEBUG?
> +
> + /* Add source location */
> + len += snprintf(output + len, out_size - len, "[%s:%s:%u] ",
> + source->file, source->func, source->line);
> +
> + /* Deserialize the buffer content */
> + len += blog_des_reconstruct(source->fmt, entry->buffer, 0, entry->len,
> + output + len, out_size - len);
> +
> + return len;
> +}
> +EXPORT_SYMBOL(blog_des_entry);
> +
> +/* No global init/exit: consumers initialize per‑module contexts explicitly */
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Binary Logging Infrastructure (BLOG)");
MODULE_AUTHOR()?
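I.e. I would expect to see something like (with the real author, of course):

   MODULE_AUTHOR("Author Name <author@example.com>");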
> diff --git a/lib/blog/blog_des.c b/lib/blog/blog_des.c
> new file mode 100644
> index 000000000000..661d5b7c2f10
> --- /dev/null
> +++ b/lib/blog/blog_des.c
> @@ -0,0 +1,385 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Binary Logging Deserialization
> + *
> + * Migrated from ceph_san_des.c with all algorithms preserved
> + */
> +
> +#include <linux/blog/blog_des.h>
> +#include <linux/blog/blog.h>
> +#include <linux/string.h>
> +#include <linux/ctype.h>
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/printk.h>
> +#include <linux/align.h>
> +#include <linux/unaligned.h>
> +
> +/**
> + * blog_des_reconstruct - Reconstructs a formatted string from serialized values
> + * @fmt: Format string containing % specifiers
> + * @buffer: Buffer containing serialized values
> + * @nr_args: Number of arguments to process (not used yet, for future)
> + * @size: Size of the buffer in bytes
> + * @out: Buffer to store the reconstructed string
> + * @out_size: Size of the output buffer
> + *
> + * Return: Number of bytes written to out buffer, or negative error code on failure
> + */
> +int blog_des_reconstruct(const char *fmt, const void *buffer, size_t nr_args,
> + size_t size, char *out, size_t out_size)
> +{
> + const char *buf_start = (const char *)buffer;
> + const char *buf_ptr = buf_start;
> + const char *buf_end = buf_start + size;
> + const char *fmt_ptr = fmt;
> + char *out_ptr = out;
> + size_t remaining = out_size - 1; /* Reserve space for null terminator */
> + size_t arg_count = 0;
> + int ret;
> +
> + if (!fmt || !buffer || !out || !out_size) {
> + pr_err("blog_des_reconstruct: invalid params fmt=%p buffer=%p out=%p out_size=%zu\n",
> + fmt, buffer, out, out_size);
> + return -EINVAL;
> + }
> +
> + *out_ptr = '\0';
> +
> + /* Process the format string */
> + while (*fmt_ptr && remaining > 0) {
> + int is_long;
> + int is_long_long;
> +
> + if (*fmt_ptr != '%') {
> + /* Copy literal character */
> + *out_ptr++ = *fmt_ptr++;
> + remaining--;
> + continue;
> + }
> +
> + /* Skip the '%' */
> + fmt_ptr++;
> +
> + /* Handle %% */
> + if (*fmt_ptr == '%') {
> + *out_ptr++ = '%';
> + fmt_ptr++;
> + remaining--;
> + continue;
> + }
> +
> + /* Skip flags (-+#0 space) */
> + while (*fmt_ptr && (*fmt_ptr == '-' || *fmt_ptr == '+' || *fmt_ptr == '#' ||
> + *fmt_ptr == '0' || *fmt_ptr == ' ')) {
> + fmt_ptr++;
> + }
> +
> + /* Skip field width (digits or *) */
> + while (*fmt_ptr && (*fmt_ptr >= '0' && *fmt_ptr <= '9'))
> + fmt_ptr++;
> + if (*fmt_ptr == '*')
> + fmt_ptr++;
> +
> + /* Skip precision (.digits or .*) */
> + if (*fmt_ptr == '.') {
> + fmt_ptr++;
> + while (*fmt_ptr && (*fmt_ptr >= '0' && *fmt_ptr <= '9'))
> + fmt_ptr++;
> + if (*fmt_ptr == '*')
> + fmt_ptr++;
> + }
> +
> + /* Parse length modifiers (l, ll, h, hh, z) */
> + is_long = 0;
> + is_long_long = 0;
> +
> + if (*fmt_ptr == 'l') {
> + fmt_ptr++;
> + is_long = 1;
> + if (*fmt_ptr == 'l') {
> + fmt_ptr++;
> + is_long_long = 1;
> + is_long = 0;
> + }
> + } else if (*fmt_ptr == 'h') {
> + fmt_ptr++;
> + if (*fmt_ptr == 'h')
> + fmt_ptr++;
> + } else if (*fmt_ptr == 'z') {
> + fmt_ptr++;
> + }
> +
> + /* Parse and handle format specifier */
> + switch (*fmt_ptr) {
> + case 's': { /* String (inline) */
> + const char *str;
> + size_t str_len;
> + size_t max_scan_len;
> +
> + if (buf_ptr >= buf_end) {
> + pr_err("blog_des_reconstruct: string arg %zu overruns buffer (no space)\n",
> + arg_count);
> + return -EFAULT;
> + }
> +
> + /* String is stored inline in buffer */
> + str = buf_ptr;
> +
> + /* Calculate maximum safe length to scan for null terminator */
> + max_scan_len = buf_end - buf_ptr;
> +
> + /* Find string length with bounds checking */
> + str_len = strnlen(str, max_scan_len);
> + if (str_len == max_scan_len && str[str_len - 1] != '\0') {
> + pr_err("blog_des_reconstruct: unterminated string at arg %zu (fmt=%s)\n",
> + arg_count, fmt);
> + return -EFAULT;
> + }
> +
> + /* Advance buffer pointer with proper alignment */
> + buf_ptr += round_up(str_len + 1, 4);
> +
> + if (buf_ptr > buf_end) {
> + pr_err("blog_des_reconstruct: string arg %zu overruns buffer after copy (fmt=%s)\n",
> + arg_count, fmt);
> + return -EFAULT;
> + }
> +
> + /* Copy string to output with bounds checking */
> + if (str_len > remaining)
> + str_len = remaining;
> + memcpy(out_ptr, str, str_len);
> + out_ptr += str_len;
> + remaining -= str_len;
> + break;
> + }
> +
> + case 'd': case 'i': { /* Integer */
> + if (is_long_long) {
> + long long val;
> +
> + if (buf_ptr + sizeof(long long) > buf_end) {
> + pr_err("blog_des_reconstruct: long long arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((long long *)buf_ptr);
> + buf_ptr += sizeof(long long);
> + ret = snprintf(out_ptr, remaining, "%lld", val);
> + } else if (is_long) {
> + long val;
> +
> + if (buf_ptr + sizeof(long) > buf_end) {
> + pr_err("blog_des_reconstruct: long arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((long *)buf_ptr);
> + buf_ptr += sizeof(long);
> + ret = snprintf(out_ptr, remaining, "%ld", val);
> + } else {
> + int val;
> +
> + if (buf_ptr + sizeof(int) > buf_end) {
> + pr_err("blog_des_reconstruct: int arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((int *)buf_ptr);
> + buf_ptr += sizeof(int);
> + ret = snprintf(out_ptr, remaining, "%d", val);
> + }
> +
> + if (ret > 0) {
> + if (ret > remaining)
> + ret = remaining;
> + out_ptr += ret;
> + remaining -= ret;
> + }
> + break;
> + }
> +
> + case 'u': { /* Unsigned integer */
> + if (is_long_long) {
> + unsigned long long val;
> +
> + if (buf_ptr + sizeof(unsigned long long) > buf_end) {
> + pr_err("blog_des_reconstruct: unsigned long long arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((unsigned long long *)buf_ptr);
> + buf_ptr += sizeof(unsigned long long);
> + ret = snprintf(out_ptr, remaining, "%llu", val);
> + } else if (is_long) {
> + unsigned long val;
> +
> + if (buf_ptr + sizeof(unsigned long) > buf_end) {
> + pr_err("blog_des_reconstruct: unsigned long arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((unsigned long *)buf_ptr);
> + buf_ptr += sizeof(unsigned long);
> + ret = snprintf(out_ptr, remaining, "%lu", val);
> + } else {
> + unsigned int val;
> +
> + if (buf_ptr + sizeof(unsigned int) > buf_end) {
> + pr_err("blog_des_reconstruct: unsigned int arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((unsigned int *)buf_ptr);
> + buf_ptr += sizeof(unsigned int);
> + ret = snprintf(out_ptr, remaining, "%u", val);
> + }
> +
> + if (ret > 0) {
> + if (ret > remaining)
> + ret = remaining;
> + out_ptr += ret;
> + remaining -= ret;
> + }
> + break;
> + }
> +
> + case 'x': case 'X': { /* Hex integer */
> + const char *hex_fmt;
> +
> + if (*fmt_ptr == 'x')
> + hex_fmt = is_long_long ? "%llx" : is_long ? "%lx" : "%x";
> + else
> + hex_fmt = is_long_long ? "%llX" : is_long ? "%lX" : "%X";
> +
> + if (is_long_long) {
> + unsigned long long val;
> +
> + if (buf_ptr + sizeof(unsigned long long) > buf_end) {
> + pr_err("blog_des_reconstruct: hex unsigned long long arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((unsigned long long *)buf_ptr);
> + buf_ptr += sizeof(unsigned long long);
> + ret = snprintf(out_ptr, remaining, hex_fmt, val);
> + } else if (is_long) {
> + unsigned long val;
> +
> + if (buf_ptr + sizeof(unsigned long) > buf_end) {
> + pr_err("blog_des_reconstruct: hex unsigned long arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((unsigned long *)buf_ptr);
> + buf_ptr += sizeof(unsigned long);
> + ret = snprintf(out_ptr, remaining, hex_fmt, val);
> + } else {
> + unsigned int val;
> +
> + if (buf_ptr + sizeof(unsigned int) > buf_end) {
> + pr_err("blog_des_reconstruct: hex unsigned int arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> + val = get_unaligned((unsigned int *)buf_ptr);
> + buf_ptr += sizeof(unsigned int);
> + ret = snprintf(out_ptr, remaining, hex_fmt, val);
> + }
> +
> + if (ret > 0) {
> + if (ret > remaining)
> + ret = remaining;
> + out_ptr += ret;
> + remaining -= ret;
> + }
> + break;
> + }
> +
> + case 'p': { /* Pointer */
> + void *ptr;
> +
> + if (buf_ptr + sizeof(void *) > buf_end) {
> + pr_err("blog_des_reconstruct: pointer arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> +
> + ptr = (void *)(unsigned long)get_unaligned((unsigned long *)buf_ptr);
> + buf_ptr += sizeof(void *);
> +
> + ret = snprintf(out_ptr, remaining, "%p", ptr);
> + if (ret > 0) {
> + if (ret > remaining)
> + ret = remaining;
> + out_ptr += ret;
> + remaining -= ret;
> + }
> + break;
> + }
> +
> + case 'c': { /* Character */
> + char val;
> +
> + if (buf_ptr + sizeof(int) > buf_end) { /* chars are promoted to int */
> + pr_err("blog_des_reconstruct: char arg %zu overruns buffer\n",
> + arg_count);
> + return -EFAULT;
> + }
> +
> + val = (char)get_unaligned((int *)buf_ptr);
> + buf_ptr += sizeof(int);
> +
> + if (remaining > 0) {
> + *out_ptr++ = val;
> + remaining--;
> + }
> + break;
> + }
> +
> + default:
> + pr_err("%s: unsupported format specifier '%%%c' at argument %zu\n",
> + __func__, *fmt_ptr, arg_count);
> + return -EINVAL;
> + }
> +
> + fmt_ptr++;
> + arg_count++;
> + }
> +
> + /* Null-terminate the output */
> + *out_ptr = '\0';
> +
> + return out_ptr - out;
> +}
> +EXPORT_SYMBOL(blog_des_reconstruct);
> +
> +/**
> + * blog_log_reconstruct - Reconstructs a formatted string from a log entry
> + * @entry: Log entry containing serialized data
> + * @output: Buffer to write the formatted string to
> + * @output_size: Size of the output buffer
> + *
> + * This reconstructs the log message but does NOT handle client_id.
> + * The caller should handle client_id separately using their module-specific callback.
> + *
> + * Return: Length of formatted string, or negative error code on failure
> + */
> +int blog_log_reconstruct(struct blog_logger *logger, const struct blog_log_entry *entry,
> + char *output, size_t output_size)
> +{
> + struct blog_source_info *source;
> +
> + if (!entry || !output || !logger)
> + return -EINVAL;
> +
> + /* Get source info */
> + source = blog_get_source_info(logger, entry->source_id);
> + if (!source)
> + return snprintf(output, output_size, "[unknown source %u]", entry->source_id);
> +
> + /* Reconstruct using the format string from source */
> + return blog_des_reconstruct(source->fmt, entry->buffer, 0, entry->len,
> + output, output_size);
> +}
> diff --git a/lib/blog/blog_module.c b/lib/blog/blog_module.c
> new file mode 100644
> index 000000000000..946393659036
> --- /dev/null
> +++ b/lib/blog/blog_module.c
> @@ -0,0 +1,781 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Binary Logging Infrastructure (BLOG) - Per-Module Support
> + *
> + * Implements per-module context management for isolated logging.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +#include <linux/list.h>
> +#include <linux/atomic.h>
> +#include <linux/sched.h>
> +#include <linux/sched/signal.h>
> +#include <linux/sched/task.h>
> +#include <linux/bitops.h>
> +#include <linux/blog/blog.h>
> +#include <linux/blog/blog_module.h>
> +
> +/* Global list of all module contexts */
> +static LIST_HEAD(blog_module_contexts);
> +static DEFINE_SPINLOCK(blog_modules_lock);
> +
> +/* Keep in sync with blog_core.c */
> +#define BLOG_LOG_BATCH_MAX_FULL 16
> +
> +/* Global module registry */
> +static struct blog_module_registry blog_registry = {
> + .modules = { NULL },
> + .allocated_bitmap = 0,
> + .lock = __SPIN_LOCK_UNLOCKED(blog_registry.lock),
> + .module_count = ATOMIC_INIT(0),
> +};
> +
> +/**
> + * blog_module_register - Register a module and allocate a slot
> + * @module_name: Name of the module
> + *
> + * Registers a module in the global registry and assigns it a slot ID (0-7).
> + * The slot ID is used to index into each task's blog_contexts array.
> + *
> + * Return: Module context on success, NULL if no slots available
> + */
> +struct blog_module_context *blog_module_register(const char *module_name)
> +{
> + struct blog_module_context *ctx;
> + unsigned long flags;
> + int slot;
> + size_t name_len;
> +
> + if (!module_name) {
> + pr_err("blog: module name is NULL\n");
> + return NULL;
> + }
> +
> + name_len = strlen(module_name);
> + if (name_len == 0) {
> + pr_err("blog: module name is empty\n");
> + return NULL;
> + }
> +
> + if (name_len >= 32) { /* sizeof(blog_module_context.name) */
What about introducing a named constant instead of the hardcoded 32?
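For example, sizeof_field() would keep the limit in sync with the structure
definition automatically:

   if (name_len >= sizeof_field(struct blog_module_context, name)) {
           pr_err("blog: module name too long: '%s'\n", module_name);
           return NULL;
   }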
> + pr_err("blog: module name too long: '%s' (max 31 chars)\n",
> + module_name);
> + return NULL;
> + }
> +
> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> + if (!ctx)
> + return NULL;
> +
> + spin_lock_irqsave(&blog_registry.lock, flags);
> +
> + /* Find first free slot */
> + slot = find_first_zero_bit((unsigned long *)&blog_registry.allocated_bitmap,
> + BLOG_MAX_MODULES);
> + if (slot >= BLOG_MAX_MODULES) {
> + spin_unlock_irqrestore(&blog_registry.lock, flags);
> + kfree(ctx);
> + pr_err("blog: no free slots available (max %d modules)\n",
> + BLOG_MAX_MODULES);
> + return NULL;
> + }
> +
> + /* Claim the slot */
> + set_bit(slot, (unsigned long *)&blog_registry.allocated_bitmap);
> + blog_registry.modules[slot] = ctx;
> + atomic_inc(&blog_registry.module_count);
> +
> + spin_unlock_irqrestore(&blog_registry.lock, flags);
> +
> + /* Initialize the context */
> + strscpy(ctx->name, module_name, sizeof(ctx->name));
> + ctx->slot_id = slot;
> + atomic_set(&ctx->refcount, 1);
> + atomic_set(&ctx->allocated_contexts, 0);
> + INIT_LIST_HEAD(&ctx->list);
> +
> + pr_info("blog: module '%s' registered with slot %d\n", module_name, slot);
> +
> + return ctx;
> +}
> +EXPORT_SYMBOL(blog_module_register);
> +
> +/**
> + * blog_module_unregister - Unregister a module and free its slot
> + * @ctx: Module context to unregister
> + *
> + * Removes the module from the global registry and frees its slot.
> + * Also cleans up any active task contexts for this module.
> + */
> +void blog_module_unregister(struct blog_module_context *ctx)
> +{
> + unsigned long flags;
> + int slot;
> +
> + if (!ctx)
> + return;
> +
> + slot = ctx->slot_id;
> + if (slot >= BLOG_MAX_MODULES)
> + goto out_free;
> +
> + spin_lock_irqsave(&blog_registry.lock, flags);
> + if (test_bit(slot, (unsigned long *)&blog_registry.allocated_bitmap)) {
> + clear_bit(slot, (unsigned long *)&blog_registry.allocated_bitmap);
> + blog_registry.modules[slot] = NULL;
> + atomic_dec(&blog_registry.module_count);
> + }
> + spin_unlock_irqrestore(&blog_registry.lock, flags);
> +
> + pr_info("blog: module '%s' unregistered from slot %d\n", ctx->name, slot);
> +
> +out_free:
> + kfree(ctx);
> +}
> +EXPORT_SYMBOL(blog_module_unregister);
> +
> +/* Release hook for per-module TLS contexts */
> +static void blog_module_clear_task(struct blog_tls_ctx *ctx)
> +{
> + struct task_struct *task;
> +
> + if (!ctx)
> + return;
> +
> + task = READ_ONCE(ctx->task);
> + if (task) {
> + WRITE_ONCE(ctx->task, NULL);
> + put_task_struct(task);
> + }
> +}
> +
> +static void blog_module_rebalance_log_batch(struct blog_logger *logger)
> +{
> + struct blog_magazine *mag;
> +
> + if (!logger)
> + return;
> +
> + if (logger->log_batch.nr_full <= BLOG_LOG_BATCH_MAX_FULL)
> + return;
> +
> + spin_lock(&logger->log_batch.full_lock);
> + if (list_empty(&logger->log_batch.full_magazines)) {
> + spin_unlock(&logger->log_batch.full_lock);
> + return;
> + }
> +
> + mag = list_first_entry(&logger->log_batch.full_magazines,
> + struct blog_magazine, list);
> + list_del(&mag->list);
> + logger->log_batch.nr_full--;
> + spin_unlock(&logger->log_batch.full_lock);
> +
> + spin_lock(&logger->alloc_batch.full_lock);
> + list_add(&mag->list, &logger->alloc_batch.full_magazines);
> + logger->alloc_batch.nr_full++;
> + spin_unlock(&logger->alloc_batch.full_lock);
> +}
> +
> +static void blog_module_queue_to_alloc_batch(struct blog_logger *logger,
> + struct blog_tls_ctx *ctx)
> +{
> + struct blog_tls_pagefrag *composite;
> +
> + if (!logger || !ctx)
> + return;
> +
> + composite = blog_ctx_container(ctx);
> + atomic_set(&ctx->refcount, 0);
> + blog_batch_put(&logger->alloc_batch, composite);
> +}
> +
> +static void blog_module_queue_to_log_batch(struct blog_logger *logger,
> + struct blog_tls_ctx *ctx)
> +{
> + struct blog_tls_pagefrag *composite;
> +
> + if (!logger || !ctx)
> + return;
> +
> + composite = blog_ctx_container(ctx);
> + atomic_set(&ctx->refcount, 0);
> + blog_batch_put(&logger->log_batch, composite);
> + blog_module_rebalance_log_batch(logger);
> +}
> +
> +static void blog_module_tls_release(void *ptr)
> +{
> + struct blog_tls_ctx *ctx = ptr;
> + struct blog_logger *logger;
> +
> + if (!ctx)
> + return;
> +
> + logger = ctx->logger;
> + if (!logger) {
> + pr_err("BUG: TLS context id=%llu has no logger for release\n",
> + ctx->id);
> + return;
> + }
> +
> + /* Clear task association */
> + blog_module_clear_task(ctx);
> +
> + /* Recycle composite to log_batch - no freeing!
> + * The composite will be recycled back to alloc_batch and reused. */
> + blog_module_queue_to_log_batch(logger, ctx);
> +}
> +
> +/**
> + * blog_module_init - Initialize a per-module BLOG context
> + * @module_name: Name of the module
> + *
> + * Creates an isolated logging context for a specific module.
> + *
> + * Return: Module context on success, NULL on failure
> + */
> +struct blog_module_context *blog_module_init(const char *module_name)
> +{
> + struct blog_module_context *ctx;
> + struct blog_logger *logger;
> + int i;
> + int ret;
> +
> + ctx = blog_module_register(module_name);
> + if (!ctx)
> + return NULL;
> +
> + logger = kzalloc(sizeof(*logger), GFP_KERNEL);
> + if (!logger)
> + goto err_unregister;
> +
> + logger->magazine_cache = kmem_cache_create("blog_magazine",
> + sizeof(struct blog_magazine),
> + 0, SLAB_HWCACHE_ALIGN, NULL);
> + if (!logger->magazine_cache)
> + goto err_logger;
> +
> + /* Initialize module context */
> + strscpy(ctx->name, module_name, sizeof(ctx->name));
> + ctx->logger = logger;
> + atomic_set(&ctx->refcount, 1);
> + INIT_LIST_HEAD(&ctx->list);
> +
> + /* Initialize logger instance */
> + INIT_LIST_HEAD(&logger->contexts);
> + spin_lock_init(&logger->lock);
> + spin_lock_init(&logger->source_lock);
> + spin_lock_init(&logger->ctx_id_lock);
> + atomic_set(&logger->next_source_id, 1);
> + logger->next_ctx_id = 1;
> + logger->total_contexts_allocated = 0;
> + logger->slot_id = ctx->slot_id;
> + logger->has_slot = true;
> + logger->owner_ctx = ctx;
> +
> + /* Initialize alloc_batch with preallocation (ready-to-use composites) */
> + ret = blog_batch_init(&logger->alloc_batch, logger->magazine_cache, true);
> + if (ret)
> + goto err_cache;
> +
> + /* Initialize log_batch empty (no preallocation - receives used composites) */
> + ret = blog_batch_init(&logger->log_batch, logger->magazine_cache, false);
> + if (ret)
> + goto err_batch_alloc;
> +
> + /* Initialize source map */
> + for (i = 0; i < BLOG_MAX_SOURCE_IDS; i++)
> + memset(&logger->source_map[i], 0, sizeof(logger->source_map[i]));
> +
> + /* Allocate per-CPU NAPI context pointers */
> + logger->napi_ctxs = alloc_percpu(struct blog_tls_ctx *);
> + if (!logger->napi_ctxs)
> + goto err_batches;
> +
> + /* Add to global list */
> + spin_lock(&blog_modules_lock);
> + list_add(&ctx->list, &blog_module_contexts);
> + spin_unlock(&blog_modules_lock);
> +
> + pr_info("BLOG: Module context initialized for %s\n", module_name);
> + ctx->initialized = true;
> + return ctx;
> +
> +err_batches:
> + blog_batch_cleanup(&logger->log_batch);
> +err_batch_alloc:
> + blog_batch_cleanup(&logger->alloc_batch);
> +err_cache:
> + if (logger->magazine_cache) {
> + kmem_cache_destroy(logger->magazine_cache);
> + logger->magazine_cache = NULL;
> + }
> +err_logger:
> + kfree(logger);
> +err_unregister:
> + blog_module_unregister(ctx);
> + return NULL;
> +}
> +EXPORT_SYMBOL(blog_module_init);
> +
> +/**
> + * blog_module_cleanup - Clean up a module's BLOG context
> + * @ctx: Module context to clean up
> + */
> +void blog_module_cleanup(struct blog_module_context *ctx)
> +{
> + struct blog_logger *logger;
> + struct blog_tls_ctx *tls_ctx, *tmp;
> + LIST_HEAD(pending);
> + int slot;
> +
> + if (!ctx || !ctx->initialized)
> + return;
> +
> + logger = ctx->logger;
> + if (!logger)
> + return;
> +
> + slot = ctx->slot_id;
> +
> + /* Remove from global list */
> + spin_lock(&blog_modules_lock);
> + list_del(&ctx->list);
> + spin_unlock(&blog_modules_lock);
> +
> + /* Detach contexts under lock, release outside */
> + spin_lock(&logger->lock);
> + list_for_each_entry_safe(tls_ctx, tmp, &logger->contexts, list)
> + list_move(&tls_ctx->list, &pending);
> + spin_unlock(&logger->lock);
> +
> + list_for_each_entry_safe(tls_ctx, tmp, &pending, list) {
> + struct task_struct *task = tls_ctx->task;
> +
> + list_del_init(&tls_ctx->list);
> + if (task && slot < BLOG_MAX_MODULES) {
> + task_lock(task);
> + if (READ_ONCE(task->blog_contexts[slot]) == tls_ctx)
> + WRITE_ONCE(task->blog_contexts[slot], NULL);
> + task_unlock(task);
> + }
> + if (atomic_dec_if_positive(&ctx->allocated_contexts) < 0)
> + atomic_set(&ctx->allocated_contexts, 0);
> +
> + /* Clear task association */
> + blog_module_clear_task(tls_ctx);
> +
> + /* Recycle composite to log_batch instead of freeing */
> + if (tls_ctx->release) {
> + tls_ctx->release(tls_ctx);
> + } else if (logger) {
> + blog_module_queue_to_log_batch(logger, tls_ctx);
> + }
> + }
> +
> + /* Clean up batches (each has its own magazine_cache, will be destroyed) */
> + blog_batch_cleanup(&logger->alloc_batch);
> + blog_batch_cleanup(&logger->log_batch);
> +
> + /* Destroy shared magazine cache */
> + if (logger->magazine_cache) {
> + kmem_cache_destroy(logger->magazine_cache);
> + logger->magazine_cache = NULL;
> + }
> +
> + /* Free per-CPU NAPI contexts */
> + if (logger->napi_ctxs)
> + free_percpu(logger->napi_ctxs);
> +
> + pr_info("BLOG: Module context cleaned up for %s\n", ctx->name);
> +
> + kfree(logger);
> + ctx->logger = NULL;
> + ctx->initialized = false;
> + ctx->slot_id = 0;
> +
> + /* Unregister to free the slot */
> + blog_module_unregister(ctx);
> +}
> +EXPORT_SYMBOL(blog_module_cleanup);
> +
> +/**
> + * blog_module_get - Increment module context reference count
> + * @ctx: Module context
> + */
> +void blog_module_get(struct blog_module_context *ctx)
> +{
> + if (ctx)
> + atomic_inc(&ctx->refcount);
> +}
> +EXPORT_SYMBOL(blog_module_get);
> +
> +/**
> + * blog_module_put - Decrement module context reference count
> + * @ctx: Module context
> + */
> +void blog_module_put(struct blog_module_context *ctx)
> +{
> + if (ctx && atomic_dec_and_test(&ctx->refcount))
> + blog_module_cleanup(ctx);
> +}
> +EXPORT_SYMBOL(blog_module_put);
> +
> +/* Per-module API implementations */
> +
> +/**
> + * blog_get_source_id_ctx - Get or allocate source ID for a module context
> + * @ctx: Module context
> + * @file: Source file name
> + * @func: Function name
> + * @line: Line number
> + * @fmt: Format string
> + *
> + * Return: Source ID
> + */
> +u32 blog_get_source_id_ctx(struct blog_module_context *ctx, const char *file,
> + const char *func, unsigned int line, const char *fmt)
> +{
> + struct blog_logger *logger;
> + struct blog_source_info *info;
> + u32 id;
> +
> + if (!ctx || !ctx->logger)
> + return 0;
> +
> + logger = ctx->logger;
> +
> + /* Get next ID */
> + id = atomic_fetch_inc(&logger->next_source_id);
> + if (id >= BLOG_MAX_SOURCE_IDS) {
> + pr_warn("BLOG: Source ID overflow in module %s\n", ctx->name);
> + return 0;
> + }
> +
> + /* Fill in source info */
> + spin_lock(&logger->source_lock);
> + info = &logger->source_map[id];
> + info->file = file;
> + info->func = func;
> + info->line = line;
> + info->fmt = fmt;
> + info->warn_count = 0;
> +#if BLOG_TRACK_USAGE
> + atomic_set(&info->napi_usage, 0);
> + atomic_set(&info->task_usage, 0);
> + atomic_set(&info->napi_bytes, 0);
> + atomic_set(&info->task_bytes, 0);
> +#endif
> + spin_unlock(&logger->source_lock);
> +
> + return id;
> +}
> +EXPORT_SYMBOL(blog_get_source_id_ctx);
> +
> +/**
> + * blog_get_source_info_ctx - Get source info for an ID in a module context
> + * @ctx: Module context
> + * @id: Source ID
> + *
> + * Return: Source info or NULL
> + */
> +struct blog_source_info *blog_get_source_info_ctx(struct blog_module_context *ctx, u32 id)
> +{
> + struct blog_logger *logger;
> +
> + if (!ctx || !ctx->logger || id >= BLOG_MAX_SOURCE_IDS)
> + return NULL;
> +
> + logger = ctx->logger;
> + return &logger->source_map[id];
> +}
> +EXPORT_SYMBOL(blog_get_source_info_ctx);
> +
> +/**
> + * blog_get_tls_ctx_ctx - Get or create TLS context for a module
> + * @ctx: Module context
> + *
> + * Uses slot-based access into task_struct's blog_contexts array.
> + * Allocates composites from the magazine batch system.
> + *
> + * Return: TLS context or NULL
> + */
> +struct blog_tls_ctx *blog_get_tls_ctx_ctx(struct blog_module_context *ctx)
> +{
> + struct blog_logger *logger;
> + struct blog_tls_ctx *tls_ctx;
> + struct blog_tls_pagefrag *composite;
> + struct blog_pagefrag *pf;
> + struct task_struct *task = current;
> + u8 slot_id;
> +
> + if (!ctx || !ctx->logger)
> + return NULL;
> +
> + logger = ctx->logger;
> + slot_id = ctx->slot_id;
> +
> + if (slot_id >= BLOG_MAX_MODULES) {
> + pr_err("blog: invalid slot_id %d for module %s\n", slot_id, ctx->name);
> + return NULL;
> + }
> +
> + /* Fast path: check if context already exists */
> + tls_ctx = READ_ONCE(task->blog_contexts[slot_id]);
> + if (tls_ctx)
> + return tls_ctx;
> +
> + /* Slow path: get composite from batch or allocate new one */
> + composite = blog_batch_get(&logger->alloc_batch);
> + if (!composite) {
> + /* Allocate new composite via page allocator (512KB) */
> + struct page *pages;
> +
> + pages = alloc_pages(GFP_KERNEL | __GFP_ZERO,
> + get_order(BLOG_TLS_PAGEFRAG_ALLOC_SIZE));
> + if (!pages)
> + return NULL;
> + composite = page_address(pages);
> + }
> +
> + tls_ctx = &composite->ctx;
> +
> + /* Initialize if this is a new or uninitialized composite */
> + if (tls_ctx->id == 0) {
> + INIT_LIST_HEAD(&tls_ctx->list);
> +
> + /* Get unique context ID */
> + spin_lock(&logger->ctx_id_lock);
> + tls_ctx->id = logger->next_ctx_id++;
> + spin_unlock(&logger->ctx_id_lock);
> +
> +#if BLOG_DEBUG_POISON
> + tls_ctx->debug_poison = BLOG_CTX_POISON;
> +#endif
> + }
> +
> + /* Initialize TLS context fields */
> + atomic_set(&tls_ctx->refcount, 1);
> + get_task_struct(task);
> + tls_ctx->task = task;
> + tls_ctx->pid = task->pid;
> + get_task_comm(tls_ctx->comm, task);
> + tls_ctx->base_jiffies = jiffies;
> + tls_ctx->release = blog_module_tls_release;
> + tls_ctx->logger = logger;
> +
> + /* Initialize embedded pagefrag to point to inline buffer */
> + pf = &composite->pf;
> + pf->pages = NULL; /* No separate pages, using inline buffer */
> + pf->buffer = composite->buf; /* Point to inline buffer */
> + pf->capacity = BLOG_TLS_PAGEFRAG_BUFFER_SIZE; /* Inline buffer size (512KB - header) */
> + spin_lock_init(&pf->lock);
> + pf->head = 0;
> + pf->alloc_count = 0;
> + pf->active_elements = 0;
> + pf->last_entry = NULL;
> +
> + /* Ensure context is linked into logger list exactly once */
> + spin_lock(&logger->lock);
> + if (list_empty(&tls_ctx->list)) {
> + list_add(&tls_ctx->list, &logger->contexts);
> + logger->total_contexts_allocated++;
> + }
> + spin_unlock(&logger->lock);
> +
> + /* Install in task's context array - use cmpxchg for atomicity */
> + if (cmpxchg(&task->blog_contexts[slot_id], NULL, tls_ctx) != NULL) {
> + /* Another thread beat us to it - recycle our composite */
> + blog_module_clear_task(tls_ctx);
> + blog_module_queue_to_alloc_batch(logger, tls_ctx);
> + return task->blog_contexts[slot_id];
> + }
> +
> + /* Context successfully installed */
> + atomic_inc(&ctx->allocated_contexts);
> +
> + pr_debug("blog: created TLS context for module %s (slot %d), task %d [%s]\n",
> + ctx->name, slot_id, task->pid, task->comm);
> +
> + return tls_ctx;
> +}
> +EXPORT_SYMBOL(blog_get_tls_ctx_ctx);
> +
> +/**
> + * blog_log_ctx - Log a message with module context
> + * @ctx: Module context
> + * @source_id: Source ID
> + * @client_id: Client ID
> + * @needed_size: Size needed for the log entry
> + *
> + * Return: Buffer to write log data to, or NULL on failure
> + */
> +void *blog_log_ctx(struct blog_module_context *ctx, u32 source_id,
> + u8 client_id, size_t needed_size)
> +{
> + struct blog_tls_ctx *tls_ctx;
> + struct blog_pagefrag *pf;
> + struct blog_log_entry *entry;
> + int alloc;
> + size_t total_size;
> +
> + if (!ctx || !ctx->logger)
> + return NULL;
> +
> + /* Get TLS context */
> + tls_ctx = blog_get_tls_ctx_ctx(ctx);
> + if (!tls_ctx)
> + return NULL;
> +
> + /* Validate payload size */
> + if (needed_size > BLOG_MAX_PAYLOAD) {
> + pr_warn_once("BLOG: payload too large (%zu > %u) in module %s\n",
> + needed_size, BLOG_MAX_PAYLOAD, ctx->name);
> + return NULL;
> + }
> +
> + /* Calculate total size needed (with alignment) */
> + total_size = round_up(sizeof(*entry) + needed_size, 8);
Ditto. The hardcoded 8 would read better as a named constant, so the
entry alignment requirement is documented in one place.
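Something along these lines, for example (BLOG_ENTRY_ALIGN is only a
name I am making up here, it does not exist in the patch):

#define BLOG_ENTRY_ALIGN	8	/* alignment of entries in the pagefrag buffer */

	total_size = round_up(sizeof(*entry) + needed_size, BLOG_ENTRY_ALIGN);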
> +
> + /* Get pagefrag from composite */
> + pf = blog_ctx_pf(tls_ctx);
> +
> + /* Reserve space from pagefrag */
> + alloc = blog_pagefrag_reserve(pf, total_size);
> + if (alloc == -ENOMEM) {
> + pr_debug("%s: allocation failed for module %s\n",
> + __func__, ctx->name);
Should this be pr_err() rather than pr_debug()? We have failed here and
are about to drop the entry and reset the whole buffer, so the message
seems worth seeing without dynamic debug enabled.
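If buffer exhaustion is expected to happen regularly on this path, maybe
a ratelimited warning would be a reasonable compromise, for instance:

	pr_warn_ratelimited("%s: pagefrag exhausted for module %s, dropping entry and resetting\n",
			    __func__, ctx->name);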
> + blog_pagefrag_reset(pf);
> + return NULL;
> + }
> +
> + /* Get pointer from allocation */
> + entry = blog_pagefrag_get_ptr(pf, alloc);
> + if (!entry) {
> + pr_err("%s: failed to get pointer from pagefrag\n", __func__);
> + return NULL;
> + }
> +
> + /* Store pending publish info for blog_log_commit_ctx() */
> + tls_ctx->pending_offset = alloc;
> + tls_ctx->pending_size = total_size;
> +
> + /* Fill in entry header */
> +#if BLOG_DEBUG_POISON
> + entry->debug_poison = BLOG_LOG_ENTRY_POISON;
> +#endif
> + entry->ts_delta = jiffies - tls_ctx->base_jiffies;
> + entry->source_id = source_id;
> + entry->len = (u8)needed_size;
Could needed_size overflow the u8 len field here? The earlier check only
guarantees needed_size <= BLOG_MAX_PAYLOAD, so the cast silently
truncates anything above 255 unless BLOG_MAX_PAYLOAD is known to fit in
a u8.
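If the intent is that BLOG_MAX_PAYLOAD always fits the u8 len field,
then a compile-time check would make that assumption explicit, something
like:

	BUILD_BUG_ON(BLOG_MAX_PAYLOAD > U8_MAX);

Otherwise the cast needs an explicit runtime check before truncating.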
> + entry->client_id = client_id;
> + entry->flags = 0;
> +
> + /* Return pointer to buffer area */
> + return entry->buffer;
> +}
> +EXPORT_SYMBOL(blog_log_ctx);
> +
> +/**
> + * blog_log_commit_ctx - Publish a reserved log entry
> + * @ctx: Module context
> + * @actual_size: Actual bytes written during serialization
> + *
> + * Publishes the log entry that was reserved by the last blog_log_ctx() call,
> + * making it visible to readers. Must be called after serialization is complete.
> + *
> + * Context: Same context as the preceding blog_log_ctx() call
> + * Return: 0 on success, negative error code on failure
> + */
> +int blog_log_commit_ctx(struct blog_module_context *ctx, size_t actual_size)
> +{
> + struct blog_tls_ctx *tls_ctx;
> + struct blog_pagefrag *pf;
> + struct blog_log_entry *entry;
> + size_t total_size;
> +
> + if (!ctx || !ctx->logger)
> + return -EINVAL;
> +
> + tls_ctx = blog_get_ctx_ctx(ctx);
> + if (!tls_ctx)
> + return -EINVAL;
> +
> + pf = blog_ctx_pf(tls_ctx);
> +
> + /* Get the entry we're committing and update its length to actual size */
> + entry = blog_pagefrag_get_ptr(pf, tls_ctx->pending_offset);
> + if (entry)
> + entry->len = (u8)actual_size;
Ditto. Could actual_size overflow the u8 len field here? Unlike
needed_size, actual_size is not validated at all before the cast.
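Maybe something like the check below before the cast, just as a sketch
(the exact upper bound you want here is up to you):

	if (WARN_ON_ONCE(actual_size > U8_MAX ||
			 actual_size > tls_ctx->pending_size - sizeof(*entry)))
		return -EINVAL;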
> +
> + /* actual_size is payload only, need to add header + alignment */
> + total_size = round_up(sizeof(struct blog_log_entry) + actual_size, 8);
Ditto. The same named alignment constant suggested above would fit here
instead of the hardcoded 8, and would keep the reserve and commit paths
in sync.
> +
> + pr_err("blog_log_commit_ctx: pending_offset=%d, actual_size=%zu, total_size=%zu, pending_size=%zu\n",
> + tls_ctx->pending_offset, actual_size, total_size, tls_ctx->pending_size);
Should this be pr_debug() rather than pr_err()? This message fires on
every successful commit, so at pr_err() level it will flood the console
on the hot path. Is it leftover debugging output?
Thanks,
Slava.
> +
> + blog_pagefrag_publish(pf, tls_ctx->pending_offset + total_size);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(blog_log_commit_ctx);
> +
> +/**
> + * blog_get_ctx_ctx - Get appropriate context based on execution context
> + * @ctx: Module context
> + *
> + * Return: TLS context or NAPI context depending on execution context
> + */
> +struct blog_tls_ctx *blog_get_ctx_ctx(struct blog_module_context *ctx)
> +{
> + if (in_serving_softirq())
> + return blog_get_napi_ctx_ctx(ctx);
> + return blog_get_tls_ctx_ctx(ctx);
> +}
> +EXPORT_SYMBOL(blog_get_ctx_ctx);
> +
> +/**
> + * blog_get_napi_ctx_ctx - Get NAPI context for current CPU
> + * @ctx: Module context
> + *
> + * Return: NAPI context or NULL
> + */
> +struct blog_tls_ctx *blog_get_napi_ctx_ctx(struct blog_module_context *ctx)
> +{
> + struct blog_logger *logger;
> + struct blog_tls_ctx **napi_ctx_ptr;
> +
> + if (!ctx || !ctx->logger)
> + return NULL;
> +
> + logger = ctx->logger;
> + if (!logger->napi_ctxs)
> + return NULL;
> +
> + /* Get pointer to the percpu pointer */
> + napi_ctx_ptr = per_cpu_ptr(logger->napi_ctxs, smp_processor_id());
> + return *napi_ctx_ptr;
> +}
> +EXPORT_SYMBOL(blog_get_napi_ctx_ctx);
> +
> +/**
> + * blog_set_napi_ctx_ctx - Set NAPI context for current CPU
> + * @ctx: Module context
> + * @tls_ctx: TLS context to set
> + */
> +void blog_set_napi_ctx_ctx(struct blog_module_context *ctx, struct blog_tls_ctx *tls_ctx)
> +{
> + struct blog_logger *logger;
> + struct blog_tls_ctx **napi_ctx_ptr;
> +
> + if (!ctx || !ctx->logger || !ctx->logger->napi_ctxs)
> + return;
> +
> + logger = ctx->logger;
> + /* Get pointer to the percpu pointer and set it */
> + napi_ctx_ptr = per_cpu_ptr(logger->napi_ctxs, smp_processor_id());
> + *napi_ctx_ptr = tls_ctx;
> +}
> +EXPORT_SYMBOL(blog_set_napi_ctx_ctx);
> diff --git a/lib/blog/blog_pagefrag.c b/lib/blog/blog_pagefrag.c
> new file mode 100644
> index 000000000000..598de9145601
> --- /dev/null
> +++ b/lib/blog/blog_pagefrag.c
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Binary Logging Page Fragment Management
> + *
> + * Migrated from ceph_san_pagefrag.c with all algorithms preserved
> + */
> +
> +#include <linux/printk.h>
> +#include <linux/slab.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/blog/blog_pagefrag.h>
> +
> +/**
> + * blog_pagefrag_init_with_buffer - Initialize pagefrag with an existing buffer
> + * @pf: pagefrag allocator to initialize
> + * @buffer: pre-allocated buffer to use
> + * @size: size of the buffer
> + *
> + * Return: 0 on success
> + */
> +int blog_pagefrag_init_with_buffer(struct blog_pagefrag *pf, void *buffer, size_t size)
> +{
> + spin_lock_init(&pf->lock);
> + pf->pages = NULL; /* No pages allocated, using provided buffer */
> + pf->buffer = buffer;
> + pf->capacity = size; /* Store size for bounds checking */
> + pf->head = 0;
> + pf->active_elements = 0;
> + pf->alloc_count = 0;
> + pf->last_entry = NULL;
> + return 0;
> +}
> +EXPORT_SYMBOL(blog_pagefrag_init_with_buffer);
> +
> +/**
> + * blog_pagefrag_reserve - Reserve space in the pagefrag buffer
> + * @pf: pagefrag allocator
> + * @n: number of bytes to reserve
> + *
> + * Checks if there is sufficient space and returns the current head offset
> + * WITHOUT advancing the head pointer. This allows the caller to write data
> + * before making it visible via blog_pagefrag_publish().
> + *
> + * This is lockless - only one writer per pagefrag (per-task context).
> + *
> + * Return: offset to reserved memory, or negative error if not enough space
> + */
> +int blog_pagefrag_reserve(struct blog_pagefrag *pf, unsigned int n)
> +{
> + if (pf->head + n > pf->capacity)
> + return -ENOMEM; /* No space left */
> + return pf->head; /* Return offset without advancing */
> +}
> +EXPORT_SYMBOL(blog_pagefrag_reserve);
> +
> +/**
> + * blog_pagefrag_publish - Publish reserved space by advancing head pointer
> + * @pf: pagefrag allocator
> + * @publish_head: new head value (offset + bytes_written)
> + *
> + * Atomically advances the head pointer to make previously written data visible
> + * to readers. Must be called after blog_pagefrag_reserve() and writing data.
> + *
> + * Uses memory barrier to ensure all writes are visible before head is updated.
> + * This prevents readers from seeing partially-written entries.
> + *
> + * This is lockless - only one writer per pagefrag (per-task context).
> + */
> +void blog_pagefrag_publish(struct blog_pagefrag *pf, u64 publish_head)
> +{
> + /* Ensure all prior writes are visible before updating head */
> + smp_wmb();
> +
> + /* Atomically update head to make data visible to readers */
> + pf->head = publish_head;
> + pf->alloc_count++;
> + pf->active_elements++;
> +}
> +EXPORT_SYMBOL(blog_pagefrag_publish);
> +
> +/**
> + * blog_pagefrag_get_ptr - Get buffer pointer from pagefrag reserve result
> + * @pf: pagefrag allocator
> + * @val: return value from blog_pagefrag_reserve
> + *
> + * Return: pointer to reserved buffer region
> + */
> +void *blog_pagefrag_get_ptr(struct blog_pagefrag *pf, u64 val)
> +{
> + void *rc = (void *)(pf->buffer + val);
> +
> + if (unlikely(pf->pages && pf->buffer != page_address(pf->pages))) {
> + pr_err("%s: invalid buffer pointer %llx @ %s\n", __func__,
> + (unsigned long long)pf->buffer, current->comm);
> + WARN_ON_ONCE(1);
> + return NULL;
> + }
> + if (unlikely((rc) < pf->buffer || (rc) >= (pf->buffer + pf->capacity))) {
> + pr_err("%s: invalid pointer %llx\n", __func__,
> + (unsigned long long)rc);
> + WARN_ON_ONCE(1);
> + return NULL;
> + }
> + return rc;
> +}
> +EXPORT_SYMBOL(blog_pagefrag_get_ptr);
> +
> +/**
> + * blog_pagefrag_reset - Reset the pagefrag allocator.
> + *
> + * Resets the head and tail pointers to the beginning of the buffer.
> + */
> +void blog_pagefrag_reset(struct blog_pagefrag *pf)
> +{
> + spin_lock(&pf->lock);
> + pf->head = 0;
> + pf->active_elements = 0;
> + pf->alloc_count = 0;
> + pf->last_entry = NULL;
> + spin_unlock(&pf->lock);
> +}
> +EXPORT_SYMBOL(blog_pagefrag_reset);
> +