linux-security-module.vger.kernel.org archive mirror
* [RFC V2] IMA Log Snapshotting Design Proposal
@ 2023-10-19 18:49 Tushar Sugandhi
  2023-10-31 18:37 ` Ken Goldman
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Tushar Sugandhi @ 2023-10-19 18:49 UTC (permalink / raw)
  To: linux-integrity, Mimi Zohar, peterhuewe, Jarkko Sakkinen, jgg,
	Ken Goldman, bhe, vgoyal, Dave Young, kexec@lists.infradead.org,
	jmorris, Paul Moore, serge, James Bottomley,
	linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian, Sush Shringarputale

=======================================================================
| Introduction                                                        |
=======================================================================
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and the changes to be made in the clients/services that are
part of a remote-attestation system.  This is the 2nd version of the
proposal.  The first version is available here[1].

Table of Contents:
------------------
A. Motivation and Background
B. Goals and Non-Goals
     B.1 Goals
     B.2 Non-Goals
C. Proposed Solution
     C.1 Solution Summary
     C.2 High-level Work-flow
D. Detailed Design
     D.1 Snapshot Aggregate Event
     D.2 Snapshot Triggering Mechanism
     D.3 Choosing A Persistent Storage Location For Snapshots
     D.4 Remote-Attestation Client/Service-side Changes
         D.4.a Client-side Changes
         D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References

Change Log:
-----------
This RFC proposal doc has the following changes compared to its first
version[1]:
     - Added table of contents for easy navigation through the doc.
     - Clearly defined the problem, and added the goals and non-goals
       this proposal plans to achieve.
     - Added diagrams in section "C.2 High-level Work-flow" to help
       the audience better understand the steps proposed.
     - Provided more clarity on changes needed in attestation client and
       remote attestation service to benefit from this feature.
     - Made the necessary changes to the doc based on the findings from
       the investigations done over the past few months.
     - Added section "F. Other Design Considerations" to address various
       concerns brought up in the first version[1] of this doc on the
       following topics:
         o TPM-PCR update counter
         o EK/AIK Public Cert
         o implications of exporting and removing records from the IMA
           measurement list
         o attestation client restarts and the remote attestation
           service being stateless
         o using 'tmpfs' to store the snapshots
         o TPM seal-unseal scenario
=======================================================================
| A. Motivation and Background                                        |
=======================================================================
Depending on the IMA policy, the IMA log can consume a significant
amount of Kernel memory on the system.  For instance, the events for
the following IMA policy entries may need to be measured in certain
scenarios, but they can also lead to a verbose IMA log when the system
runs for a long period of time.
┌───────────────────────────────────────┐
│# PROC_SUPER_MAGIC                     │
│measure fsmagic=0x9fa0                 │
│# SYSFS_MAGIC                          │
│measure fsmagic=0x62656572             │
│# DEBUGFS_MAGIC                        │
│measure fsmagic=0x64626720             │
│# TMPFS_MAGIC                          │
│measure fsmagic=0x01021994             │
│# RAMFS_MAGIC                          │
│measure fsmagic=0x858458f6             │
│# SECURITYFS_MAGIC                     │
│measure fsmagic=0x73636673             │
│# OVERLAYFS_MAGIC                      │
│measure fsmagic=0x794c7630             │
│# log, audit or tmp files              │
│measure obj_type=var_log_t             │
│measure obj_type=auditd_log_t          │
│measure obj_type=tmp_t                 │
└───────────────────────────────────────┘

Secondly, certain systems are configured to take Kernel updates using
Kexec soft-boot.  The IMA log from the previous Kernel gets carried
over, and the Kernel memory consumption problem worsens when such
systems undergo multiple Kexec soft-boots over a long period of time.

The above two scenarios can cause the IMA log to consume significant
memory on the system.

In addition, processing a larger IMA log on the Remote-Attestation
service side is both time-consuming and inefficient, because the
majority of the events would have already been attested in previous
attestation requests.

To solve this problem, putting a cap on the in-memory IMA log or
truncating the IMA log to reclaim memory are not practical solutions.
Putting a cap would result in events not getting measured in the IMA
log, which would be a security vulnerability.  Truncating the log would
make the log go out of sync with the TPM PCR quote, causing remote
attestation of the system to fail until the system goes through a hard
reboot.  Therefore, both these solutions are unacceptable.

A more sophisticated solution is required, one which reduces the memory
pressure on the system and continues to support remote attestation for
a longer duration without disruptions.

=======================================================================
| B. Goals and Non-Goals                                              |
=======================================================================
-----------------------------------------------------------------------
| B.1 Goals                                                           |
-----------------------------------------------------------------------
To address the issues described in the section above, we propose
enhancements to the IMA subsystem to achieve the following goals:

  a. Reduce memory pressure on the Kernel caused by larger in-memory
     IMA logs.

  b. Preserve the system's ability to get remotely attested using the
     IMA log, even after implementing the enhancements to reduce memory
     pressure caused by the IMA log.  IMA's integrity guarantees should
     be maintained.

  c. Provide mechanisms from Kernel side to the remote attestation
     service to make service-side processing more efficient.

-----------------------------------------------------------------------
| B.2 Non-Goals                                                       |
-----------------------------------------------------------------------
  a. Implementing the changes needed in the remote attestation
     client/service to benefit from the proposed enhancements is out of
     scope of this proposal.  However, we will briefly discuss what needs
     to be done in that space.

=======================================================================
| C. Proposed Solution                                                |
=======================================================================

This section provides a high-level summary of the proposed solution and
the necessary steps in the work-flow.  The details of each aspect of
the solution, along with the alternate approaches considered, are
discussed in section "D. Detailed Design".

-----------------------------------------------------------------------
| C.1 Solution Summary                                                |
-----------------------------------------------------------------------
To achieve the goals described in the section above, we propose the
following changes to the IMA subsystem.

     a. The IMA log from Kernel memory will be offloaded to some
        persistent storage disk to keep the system running reliably
        without facing memory pressure.
        More details, alternate approaches considered, etc. are present
        in section "D.3 Choosing A Persistent Storage Location For
        Snapshots" below.

     b. The IMA log will be divided into multiple chunks (snapshots).
        Each snapshot would be a delta between the two instances when
        the log was offloaded from memory to the persistent storage
        disk.

     c. Some user-mode (UM) process (like a remote-attestation client)
        will be responsible for writing the IMA log snapshot to the disk.

     d. The same UM process would be responsible for triggering the IMA
        log snapshot.

     e. There will be a well-known location for storing the IMA log
        snapshots on the disk.  It will be non-trivial for UM processes
        to change that location after booting into the Kernel.

     f. A new event, "snapshot_aggregate", will be computed and measured
        in the IMA log as part of this feature.  It should help the
        remote-attestation client/service benefit from the IMA log
        snapshot feature.
        The "snapshot_aggregate" event is described in more detail in
        section "D.1 Snapshot Aggregate Event" below.

     g. If the existing remote-attestation client/services do not change
        to benefit from this feature or do not trigger the snapshot,
        the Kernel will continue to have its current functionality of
        maintaining an in-memory full IMA log.

Additionally, the remote-attestation client/services need to be updated
to benefit from the IMA log snapshot feature.  These proposed changes
are described in section "D.4 Remote-Attestation Client/Service Side
Changes" below, but their implementation is out of scope for this
proposal.

-----------------------------------------------------------------------
| C.2 High-level Work-flow                                            |
-----------------------------------------------------------------------
This section describes the steps to take a snapshot of the IMA log.

The proposed high level work-flow of IMA log snapshotting is as
follows:
     a. A user-mode process will trigger the snapshot by opening a file
        in securityfs, say /sys/kernel/security/ima/snapshot (referred
        to as "sysk_ima_snapshot_file" from here onwards).
        See Step #a in Diagram #1 below.

     b. The Kernel will get the current TPM PCR values and store them as
        template data in a new IMA event "snapshot_aggregate".
        This event will be measured by IMA using critical data
        measurement functionality[2].
        Measuring the "snapshot_aggregate" will be an atomic operation
        similar to any other IMA log measurement.
        See Step #b and #c in Diagram #1 below.

     c. Once the "snapshot_aggregate" is computed and measured in the
        IMA log, the prior IMA events will be made available in the
        "sysk_ima_snapshot_file".
        See Step #b and #c in Diagram #1 below.

                                Diagram #1
                                ----------
            Step #a                                   Step #b and #c
           ---------                                 ----------------
     (In-memory IMA log)                         (In-memory IMA log)
    .--------------------.                     .----------------------.
    | Event #E1          |                     | Event #E1            |
    | Event #E2          |                     | Event #E2            |
    |                    |                     |                      |
    |                    |                     | "snapshot_aggregate" |
    |                    |                     |   (#E1+#E2)          |
    '--------------------'                     '----------------------'
              ^                                            ^
              |                                            |
              |                                            |
KM         *Kernel*                step #b.Kernel writes  |
---          |                       "snapshot_aggregate" |
UM           |                                to IMA log  |
              |                                            |
              |                    step #c.Kernel writes   |
              |                    the events E1 and E2 to |
              |                     sysk_ima_snapshot_file |
              V                                            V
    (sysk_ima_snapshot_file)                   (sysk_ima_snapshot_file)
    .--------------------.                      .--------------------.
    |                    |                      | Event #E1          |
    |       {Empty}      |        ====>         | Event #E2          |
    |                    |                      |                    |
    '--------------------'                      '--------------------'
              ^                                            ^
              |                                            |
              |Step #a Client opens                        |
              |sysk_ima_snapshot_file                      |
              |                                            |
   *Attestation Client (UM)*                   *Attestation Client (UM)*
              |                                            |
              |                                            |
              |                                            |
              V                                            V
   UM_snapshot_file  (on DISK)               UM_snapshot_file  (on DISK)
    .--------------------.                      .--------------------.
    |                    |                      |                    |
    |       {Empty}      |                      |       {Empty}      |
    |                    |                      |                    |
    '--------------------'                      '--------------------'

     d. The UM process will copy those IMA events from
        "sysk_ima_snapshot_file" to a snapshot file on disk chosen by UM
        (referred to as "UM_snapshot_file" from here onwards).
        The location, file-system type, access permissions etc. of the
        "UM_snapshot_file" would be controlled by the UM process itself.
        As described in section D.3, the location of "UM_snapshot_file"
        should be well-known.
        See Step #d in Diagram #2 below.

     e. Once UM is done copying the IMA events from
        "sysk_ima_snapshot_file" to "UM_snapshot_file", it will indicate
        to the Kernel that the snapshot can be finalized by writing any
        data to the "sysk_ima_snapshot_file".  The UM process cannot
        prevent the IMA log purge operation after this point.
        See Step #e in Diagram #2 below.

                                Diagram #2
                                ----------
            Step #d                                   Step #e
           ---------                                 --------
     (In-memory IMA log)                         (In-memory IMA log)
    .--------------------.                     .----------------------.
    | Event #E1          |                     | Event #E1            |
    | Event #E2          |                     | Event #E2            |
    |"snapshot_aggregate"|                     | "snapshot_aggregate" |
    |   (#E1+#E2)        |                     |   (#E1+#E2)          |
    '--------------------'                     '----------------------'
              ^                                            ^
              |                                            |
              |                                            |
KM         *Kernel*                                    *Kernel*
---          |                                            |
UM           |                                            |
              |                                            |
              |                                            |
              |                                            |
              V                                            V
    (sysk_ima_snapshot_file)                   (sysk_ima_snapshot_file)
    .--------------------.                      .--------------------.
    | Event #E1          |                      | "done"             |
    | Event #E2          |        ====>         |                    |
    |                    |                      |                    |
    '--------------------'                      '--------------------'
             ^                                             ^
             |Step #d Client copies                        |
             |events to UM_snapshot_file                   |
             |                                             |
             |                                             |
             |                       step #e Client writes |
             |                        "done" to the file   |
             |                     sysk_ima_snapshot_file  |
             |                                             |
  *Attestation Client (UM)*                   *Attestation Client (UM)*
             |                                             |
             |                                             |
             |                                             |
             V                                             V
   UM_snapshot_file  (on DISK)               UM_snapshot_file  (on DISK)
    .--------------------.                      .--------------------.
    | Event #E1          |                      |Event #E1           |
    | Event #E2          |                      |Event #E2           |
    |                    |                      |                    |
    '--------------------'                      '--------------------'

     f. The Kernel will truncate the current IMA log and clear the
        HTable up to the "snapshot_aggregate" marker.
        See Step #f in Diagram #3 below.

                                Diagram #3
                                ----------
            Step #f
           ---------
      (In-memory IMA log)
    .----------------------.
    | "snapshot_aggregate" |
    | Event #E4            |
    | Event #E5            |
    '----------------------'
              ^
              |
              | Step #f
KM           | Kernel removes the old events before
---          |  "snapshot_aggregate" i.e. #E1 and #E2.
UM           | And continues to measure the new events
              | i.e. #E4 and #E5
              |
              |
              V
   (sysk_ima_snapshot_file)
    .--------------------.
    | "done"             |
    |                    |
    |                    |
    '--------------------'
              ^
              |
              |
              | Attestation Client can now close
              | the sysk_ima_snapshot_file.
              | When it is reopened by the client, a new snapshot will
              | be triggered.
              V
   UM_snapshot_file  (on DISK)
    .--------------------.
    | Event #E1          |
    | Event #E2          |
    |                    |
    '--------------------'

     g. Optionally, UM can prevent the IMA log purge by closing the
        "sysk_ima_snapshot_file" without performing a write operation
        on it.  In this case, the events in the IMA log before
        the latest "snapshot_aggregate" will not be purged.
        While the "snapshot_aggregate" marker may still remain in the
        log as an intermediate entry, it can be ignored since it will
        not interfere with the remote attestation.

     h. This work-flow should work when interleaved with Kexec 'load'
        and 'execute' events and should not cause IMA log + snapshot to
        go out of sync with PCR quotes.  The implementation details are
        omitted from this document for brevity.
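
To make the work-flow above concrete, below is a minimal user-mode
sketch of steps (a) through (e).  This is illustration only, not part
of the proposed Kernel changes: the snapshot node path is the one
proposed in this document, while the on-disk file path, buffer size,
and error handling are arbitrary choices made for the example.

/*
 * Illustrative UM client sketch (not part of this proposal's
 * implementation).  The UM_snapshot_file path below is hypothetical.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	ssize_t n;

	/* Step (a): opening the snapshot node triggers the snapshot. */
	int snap = open("/sys/kernel/security/ima/snapshot", O_RDWR);
	if (snap < 0) {
		perror("open sysk_ima_snapshot_file");
		return 1;
	}

	/* Step (d): copy the exported events to UM_snapshot_file. */
	int out = open("/var/lib/ima/ima_snapshot_001.log",
		       O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (out < 0) {
		perror("open UM_snapshot_file");
		close(snap);		/* no write: log is not purged */
		return 1;
	}
	while ((n = read(snap, buf, sizeof(buf))) > 0)
		write(out, buf, (size_t)n);
	fsync(out);
	close(out);

	/* Step (e): any write tells the Kernel the events are safely on
	 * disk; the Kernel may now purge them from the in-memory log. */
	if (write(snap, "done", 4) < 0)
		perror("finalize snapshot");

	close(snap);
	return 0;
}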

=======================================================================
| D. Detailed Design                                                  |
=======================================================================

-----------------------------------------------------------------------
| D.1 Snapshot Aggregate Event                                        |
-----------------------------------------------------------------------
When the IMA log snapshot is triggered, IMA will pause the measurements
and start computing and measuring the "snapshot_aggregate" event.

The Template Data of the "snapshot_aggregate" event will have the
following grammar:
     TEMPLATE_DATA      := <Snapshot_Counter>";"<PCR_Banks>";"
     Snapshot_Counter   := "Snapshot_Attempt_Count="
                               <num. snapshot attempts>
     PCR_Banks          := <PCR_Bank>|<PCR_Banks> "," <PCR_Bank>
     PCR_Bank           := <PCR_Hash_Type> ":" <PCRn>
     PCR_Hash_Type      := "sha1"|"sha256"|"sha384"
     PCRn               := "PCR"<N>":"<PCR_Hash>
     PCR_Hash           := <hash of the PCR N>
     N                  := [0-23]

The Data Hash of the "snapshot_aggregate" event log line is
Append(H(snapshot_file), PCR0, PCR1, ... PCR23).  All available PCR
banks will be included in the template data in a known structure.
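
As an illustration of the grammar (a sketch only - the helper below is
hypothetical, handles a single PCR bank, and omits gathering the actual
PCR values from the TPM):

#include <stdio.h>

/* Build a template-data string for one PCR bank per the grammar above.
 * pcr_hex[] holds the 24 PCR values of the bank as hex strings
 * (without the "0x" prefix); bank is e.g. "sha256". */
static void build_template_data(char *out, size_t len, unsigned int attempt,
				const char *bank, const char *pcr_hex[24])
{
	size_t off;
	int i;

	off = snprintf(out, len, "Snapshot_Attempt_Count=%u;", attempt);
	for (i = 0; i < 24 && off < len; i++)
		off += snprintf(out + off, len - off, "%s:PCR%d:0x%s%s",
				bank, i, pcr_hex[i],
				i == 23 ? ";" : ",");
}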

The events generated in the window between "triggering of a snapshot"
and "computation and measurement of the 'snapshot_aggregate' event
data" will be queued, and they will be measured in the IMA log after
the "snapshot_aggregate" event data is computed and measured.

After the "snapshot_aggregate" event data is computed and measured, the
UM process will dump the events before "snapshot_aggregate" to a
well-known location on persistent storage in "UM_snapshot_file".  Once
the UM process signals that the work is complete, IMA will remove those
events from the IMA log.  The IMA log will now have
"snapshot_aggregate" as the first event in it.

The "snapshot_aggregate" marker provides the following benefits:
     a. It facilitates dividing the IMA log into multiple chunks and
        provides a mechanism to verify the integrity of the system
        using only the latest chunks during remote attestation.

     b. It provides tangible evidence from the Kernel to the attestation
        client that IMA log snapshotting has been enabled and at least
        one snapshot exists on the system.

     c. It helps both the Kernel and UM attestation client define clear
        boundaries between multiple snapshots.

     d. In the event of multiple snapshots, the last measured
        "snapshot_aggregate" marker, which is present in the current
        segment of the IMA log, has sufficient information to verify the
        integrity of the IMA log segment as well as the previous
        snapshots using the PCR quotes.

     e. In the event of multiple snapshots, say N, if the
        remote-attestation service has already processed the last N-1
        snapshots, it can efficiently parse through them by just
        processing "snapshot_aggregate" events to compute the PCR quotes
        needed to verify the events in the last snapshot.  This should
        drastically improve the IMA log processing efficiency of the
        service.

-----------------------------------------------------------------------
| D.2 Snapshot Triggering Mechanism                                   |
-----------------------------------------------------------------------
The snapshot triggering mechanism is described below:

     a. The IMA subsystem will create a new file:
        /sys/kernel/security/ima/snapshot
        (a.k.a. sysk_ima_snapshot_file).

     b. A UM process opening this file will trigger a snapshot.
        The file will be opened exclusively, so only one UM process can
        trigger the snapshot at a time.

     c. Once the kernel has written the "snapshot_aggregate" marker to
        the IMA log, the IMA log prior to that event can be read by UM
        on the same FD.

     d. When UM writes some data to "sysk_ima_snapshot_file", the kernel
        will finalize the snapshot by purging the in-memory IMA log.

     e. If UM closes "sysk_ima_snapshot_file" without writing to it,
        the Kernel will not purge the IMA log.

     f. If a system administrator requires that only a specific client
        process should trigger the snapshot, this capability can be set
        as an SELinux policy.
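
For illustration only, the rough shape of such a securityfs node is
sketched below.  This is not the proposed implementation: the handler
bodies are placeholders, and the real handlers would have to serialize
against the measurement list, record the "snapshot_aggregate" event,
stream the serialized measurements, and trim the list.

#include <linux/fs.h>
#include <linux/security.h>

/* Sketch only: shape of the /sys/kernel/security/ima/snapshot node. */
static int ima_snapshot_open(struct inode *inode, struct file *file)
{
	/* Item (b): exclusive open by a UM process triggers the
	 * snapshot; the "snapshot_aggregate" event would be recorded
	 * here before any events are exported. */
	return 0;
}

static ssize_t ima_snapshot_read(struct file *file, char __user *buf,
				 size_t count, loff_t *ppos)
{
	/* Item (c): return the events measured before the
	 * "snapshot_aggregate" marker on the same fd. */
	return 0;
}

static ssize_t ima_snapshot_write(struct file *file, const char __user *buf,
				  size_t count, loff_t *ppos)
{
	/* Item (d): any write finalizes the snapshot; the exported
	 * events may now be purged from the in-memory log. */
	return count;
}

static int ima_snapshot_release(struct inode *inode, struct file *file)
{
	/* Item (e): a close without a prior write leaves the log
	 * untouched. */
	return 0;
}

static const struct file_operations ima_snapshot_ops = {
	.open		= ima_snapshot_open,
	.read		= ima_snapshot_read,
	.write		= ima_snapshot_write,
	.release	= ima_snapshot_release,
};

/* Registered under the existing "ima" directory in securityfs, e.g.:
 *	securityfs_create_file("snapshot", 0600, ima_dir, NULL,
 *			       &ima_snapshot_ops);
 */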

-----------------------------------------------------------------------
| D.3 Choosing A Persistent Storage Location For Snapshots            |
-----------------------------------------------------------------------
Choosing the snapshot location is handled by the UM process.  The
location should be well known, potentially set in a configuration file
or the IMA policy file under /etc.  The configuration file should be
marked as read-only for UM processes.
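
For illustration only, such a configuration file could look like the
following (the file name and keys below are hypothetical, not part of
this proposal):

     # /etc/ima/snapshot.conf (hypothetical example)
     # Well-known directory for UM_snapshot_file(s)
     snapshot_dir=/var/lib/ima/snapshots
     # Naming pattern for individual snapshot files
     snapshot_file=ima_snapshot_%u.log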

The Kernel will wait for the UM process to indicate that the current
IMA log has been written to "UM_snapshot_file", and only then will the
Kernel truncate the in-memory IMA log.  UM is responsible for clearing
any existing UM_snapshot_file(s) on system start or on a hard
reboot/power cycle.

During the first RFC, concerns were raised over the protections of the
"UM_snapshot_file".  Just like the current IMA log sent over the
network to the attestation service, the "UM_snapshot_file" file is not
resistant to modifications, neither should it be.  The TPM PCR quote
provides the guarantee that the IMA log has not been modified - the
same PCR quote continues to provide that guarantee for the snapshot
files as well.
Using the template data in the snapshot_aggregate marker, the
attestation service can validate the integrity of the data provided in
the past snapshots.  The log is validated all the way to the final
event, which is validated by the PCR quote.

-----------------------------------------------------------------------
| D.4 Remote-Attestation Client/Service-side Changes                  |
-----------------------------------------------------------------------
A remote attestation client and service will need to be aware of the
"snapshot_aggregate" marker and how it should be handled.  A typical
attestation path on the system will remain the same - send the current
IMA log along with the signed TPM PCR quotes to the remote attestation
service.  But if the feature is enabled and a snapshot is taken, the
attestation clients and remote services need to be aware that they must
also use the "UM_snapshot_file(s)", along with the PCR quotes and the
data in the "snapshot_aggregate" event in the IMA log, to re-establish
the chain of trust.

As mentioned above, if the existing remote-attestation client/services
do not change to benefit from the log snapshotting feature, or do not
trigger the snapshot, the Kernel will continue to have its current
functionality of maintaining an in-memory full IMA log.

   D.4.a Client-side changes
-----------------------------------------------------------------------
To use the snapshot feature, the attestation client needs to know:
     - if snapshotting is supported/enabled by the Kernel
     - if snapshotting is supported/enabled by the remote-attestation
       service it interacts with
     - the correct order of UM_snapshot_file(s) for backtracking through
       the IMA log

The protocol between the remote-attestation client and service needs to
be updated to send previous snapshots to the service as requested.

If the service is not yet ready to process the "snapshot_aggregate"
events and the snapshots, then the clients should not write any data to
the sysk_ima_snapshot_file before closing it.  This will add the
"snapshot_aggregate" marker to the IMA log without purging the log.
Once the service implements "snapshot_aggregate" event parsing, the
client can implement the functionality to write to the
"sysk_ima_snapshot_file" each time it triggers the snapshot.  This
would indicate to the Kernel that the log should be purged.

   D.4.b Service-side changes
-----------------------------------------------------------------------
For the remote attestation to work, the service will need to know how
to validate the "snapshot_aggregate" entry in the IMA log.  It will
have to read the PCR values present in the template data of the
"snapshot_aggregate" event in the latest IMA log segment, and ensure
that the PCR quotes align with the contents of the past
UM_snapshot_file(s).  This will re-establish the chain of trust needed
for the system to pass remote attestation.  This will also maintain the
ability of the remote-attestation service to seal the secrets, if the
client and service use the TPM seal/unseal mechanism to attest the
state of the system.

The client-service protocol will need an implementation for requesting
previous snapshots of the IMA logs.  There may be various scenarios in
which such a request is made, such as:
     - the service is stateless and requires all the data since boot
       for evaluating if the attestation should succeed or not.
     - there are linked IMA events that are split across a snapshot
       boundary
     - the expected event that the service is looking for is not present
       in the current list of IMA log + UM_snapshot_file(s).

The service will need to request the client to send the old
UM_snapshot_file(s) and ensure that the log replay still generates the
expected PCR values provided in the quote.
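
A sketch of that replay step is shown below.  It is illustrative only:
it assumes a sha256 PCR bank, assumes the template hashes of the
snapshot's events have already been parsed out of the UM_snapshot_file,
and leaves the choice of the starting PCR value (all zeroes for the
first segment) to the caller.

#include <openssl/sha.h>
#include <string.h>

/* Replay a snapshot's template hashes into a simulated PCR and compare
 * the result with the corresponding PCR value carried in the template
 * data of the next segment's "snapshot_aggregate" event. */
static int replay_matches(const unsigned char hashes[][SHA256_DIGEST_LENGTH],
			  size_t count,
			  const unsigned char start_pcr[SHA256_DIGEST_LENGTH],
			  const unsigned char expected_pcr[SHA256_DIGEST_LENGTH])
{
	unsigned char pcr[SHA256_DIGEST_LENGTH];
	unsigned char buf[2 * SHA256_DIGEST_LENGTH];
	size_t i;

	memcpy(pcr, start_pcr, sizeof(pcr));
	for (i = 0; i < count; i++) {
		/* TPM extend: PCR = SHA256(PCR || template_hash) */
		memcpy(buf, pcr, SHA256_DIGEST_LENGTH);
		memcpy(buf + SHA256_DIGEST_LENGTH, hashes[i],
		       SHA256_DIGEST_LENGTH);
		SHA256(buf, sizeof(buf), pcr);
	}
	return memcmp(pcr, expected_pcr, sizeof(pcr)) == 0;
}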

To avoid asking for previous snapshot chunks from the system, the
service may maintain the past attestation status of the system at a
given snapshot checkpoint.  This would make the service more efficient
in processing IMA logs for attestation.  It will reduce the persistent
storage space requirements on the client.  But it would require the
service to be stateful.  It's a trade-off which needs to be evaluated
by the owners of the remote attestation service.

If the clients are not yet updated to trigger snapshots, they will
still be sending the IMA log in its entirety without any
"snapshot_aggregate" event recorded in the log.  Processing such IMA
logs is an existing service behavior, and the service should continue
supporting it until all the clients are updated to take IMA log
snapshots.  The service can use the presence of the
"snapshot_aggregate" event in the IMA log to determine and track which
clients in the fleet have the capability to generate IMA log snapshots.

=======================================================================
| E. Example Walk-through                                             |
=======================================================================
This section provides an example walk-through of the IMA log snapshot
capturing scenario.  Assume a system has the following IMA log before a
snapshot is taken.
┌───┬─────────────┬────────┬────────────┬────────────────┬────────────┐
│PCR│Template Hash│Template│ Data Hash  │   Data File    │TemplateData│
├───┼─────────────┼────────┼────────────┼────────────────┼────────────┤
│10 │322a847385...│ima-sig │sha256:309..│boot_aggregate  │            │
├───┼─────────────┼────────┼────────────┼────────────────┼────────────┤
│10 │92dbf55061...│ima-buf │sha256:b93..│kernel_version..│352342347...│
├───┼─────────────┼────────┼────────────┼────────────────┼────────────┤
│11 │e8e12d9532...│ima-buf │sha256:cd0..│dm_table_load.. │568956899...│
├───┼─────────────┼────────┼────────────┼────────────────┼────────────┤
│12 │e8e12d9532...│ima-sig │sha256:560..│/usr/lib/modul..│            │
└───┴─────────────┴────────┴────────────┴────────────────┴────────────┘

For this example, the PCR values in sha256 bank are as follows:

     sha256:
         0 : 0xDA009CB9DDBF2DF2...8D3CAE516A99847262790479368F82B6
         ...
         ...
         10: 0xB93EBF68FC66C6B6...303BC0D5DA0B419F059DE27EAE3BAA29
         11: 0xCD00ABB3D84DB0F0...01A1ADCDCADB15DEED47BA6FCE40D420
         12: 0x560153FB6A0CC603...892BB48772682F692E44A0393281DB45
         ...
         ...
         23: 0x0000000000000000...00000000000000000000000000000000

If a snapshot is taken at this point, the current IMA log will be
written to the disk.  The events generated in the window between
"triggering of a snapshot" and "computation and measurement of the
snapshot_aggregate event data" will be queued, and they will be
measured in the IMA log after the "snapshot_aggregate" event data is
computed and measured.

The state of the IMA log after the snapshot:
┌───┬─────────┬────────┬───────────┬──────────────────┬───────────────┐
│PCR│Template │Template│ Data Hash │     Data File    │ Template      │
│   │ Hash    │        │           │                  │   Data        │
├───┼─────────┼────────┼───────────┼──────────────────┼───────────────┤
│10 │e55cba...│ima-buf │sha256:30a.│snapshot_aggregate│<TEMPLATE_DATA>│
│   │         │        │           │                  │see the grammar│
│   │         │        │           │                  │   below       │
└───┴─────────┴────────┴───────────┴──────────────────┴───────────────┘

  where the Data Hash of the log line is
  Append(H(snapshot_file), PCR0, PCR1, ... PCR23).  All available PCR
  banks will be included in the template data in a known structure.

The Template Data will follow the grammar below:
     TEMPLATE_DATA      := <Snapshot_Counter>";"<PCR_Banks>";"
     Snapshot_Counter   := "Snapshot_Attempt_Count="
                               <num. snapshot attempts>
     PCR_Banks          := <PCR_Bank>|<PCR_Banks> "," <PCR_Bank>
     PCR_Bank           := <PCR_Hash_Type> ":" <PCRn>
     PCR_Hash_Type      := "sha1"|"sha256"|"sha384"
     PCRn               := "PCR"<N>":"<PCR_Hash>
     PCR_Hash           := <hash of the PCR N>
     N                  := [0-23]

The state of the PCRs after the snapshot:
     (Only PCR 10 will change, since "snapshot_aggregate" is extended
      in that PCR.)
     sha256:
         0 : 0xDA009CB9DDBF2DF2...8D3CAE516A99847262790479368F82B6
         ...
         ...
         10 : 0x30A1B69F09A9599...2518394A8E1A3B8D343D4458E6FB0B04
         11 : 0xCD00ABB3D84DB0F...01A1ADCDCADB15DEED47BA6FCE40D420
         12 : 0x560153FB6A0CC60...892BB48772682F692E44A0393281DB45
         ...
         ...
         23: 0x0000000000000000...00000000000000000000000000000000

An example of "snapshot_aggregate" template data is given below.
10 e55cba... ima-buf  sha256:30a... snapshot_aggregate
     Snapshot_Attempt_Count=7;
     sha256:PCR0:0xDA009CB9DDBF2DF2...8D3CAE516A99847262790479368F82B6,
     sha256:PCR1:0x30A1B69F09A9599E...2518394A8E1A3B8D343D4458E6FB0B04,
     ...
     sha256:PCR23:0x30A1B69F09A9599...2518394A8E1A3B8D343D4458E6FB0B04;

Future IMA events can then continue to be extended into the assigned
PCRs and added to the IMA log after the "snapshot_aggregate" event.
Since IMA measurements extend the TPM PCRs and computing the
"snapshot_aggregate" involves reading the TPM PCR banks, IMA
measurements must be suspended until the "snapshot_aggregate" is
computed and measured.  Otherwise, these two operations may interfere
with each other, compromising the integrity of the system.

The remote-attestation service can verify the contents of the past
(N-1) UM_snapshot_file(s) by replaying the events in them and comparing
the result with the PCR values stored in the template data of the first
"snapshot_aggregate" event of the subsequent IMA log segment.

=======================================================================
| F. Other Design Considerations                                      |
=======================================================================
     a. In v1 of this RFC proposal, it was discussed [5] whether we
        should use the TPM-PCR update counter in the "snapshot_aggregate"
        event.
            => After the initial investigation and prototyping[3], we
               discovered that the TPM PCR update counter gets
               incremented when any of the PCRs in the PCR bank gets
               updated.  The counter is not specific to any single PCR
               (e.g. PCR 10, where IMA extends the measurements by
               default).
               Adding the TPM PCR update counter to the
               "snapshot_aggregate" event won't provide any additional
               benefit, therefore we do not plan to include it in any of
               the IMA log measurements.

     b. In v1 of this RFC proposal, it was discussed [6] whether we
        should add the EK/AIK public cert as part of the
        "snapshot_aggregate" event data.
            => In a typical remote attestation scenario that does not
               use IMA log snapshotting, the EK/AIK public cert
               typically needs to be sent to the remote attestation
               service along with the IMA log and TPM PCR quotes.  The
               EK/AIK public cert verifies that the TPM PCR quotes are
               signed by a genuine TPM, which in turn verifies that the
               IMA log is not tampered with.  Since the EK/AIK public
               cert is sent outside of the IMA log anyway in this
               scenario, it does not need to be sent as part of the IMA
               log again.  In other scenarios (e.g. seal/unseal, and
               some cases of fTPM/vTPM), the EK/AIK public cert is
               either unavailable or is not used for remote attestation.
               Therefore we will not be adding the EK/AIK public cert as
               part of the "snapshot_aggregate" event data.

     c. In one of the previous attempts to export the IMA log [4],
        several aspects of the problem were discussed.
            o "The implications of exporting and removing records from
              the IMA measurement list needs to be considered carefully"
                  => We are addressing that in this design by
                     introducing the IMA event "snapshot_aggregate", a
                     well-known location to store the snapshots,
                     coordination between the Kernel and the UM process
                     to trigger and capture the snapshot, and guidance
                     to the client/service on processing the events.
            o Conflating the two different use cases - i.e. both UM and
              the Kernel requesting exporting and removing of the
              measurement list records.
                  => We are addressing that in this design by allowing
                     only UM to request the snapshot and store it at the
                     well-known persistent storage location, while only
                     the Kernel is responsible for removing the exported
                     events from Kernel memory.
                     The responsibilities of the Kernel and the UM client
                     are clearly defined, and they are distinct from each
                     other.
            o The feedback from the IMA maintainer on exporting the IMA
              log [4] was that the user-space would be responsible for
              safely storing the measurements.  The kernel would only be
              responsible for limiting permission, perhaps based on a
              capability, before removing records from the measurement
              list.
                  => We have incorporated this feedback in our design.
                      - The UM process triggers the snapshot.
                      - The UM process is responsible for storing the
                        snapshot on the persistent storage.
                      - The Kernel co-ordinates with the UM process
                        before removing the records from the measurement
                        list.

     d. In v1 of this RFC proposal, some clarification was needed to
        address the attestation client restarts and the remote
        attestation service being stateless[7].
            => As mentioned in the "Non-Goals" section above,
               implementing the changes needed in the remote attestation
               client/service are out of scope for this document.
               However, taking the IMA log snapshots should not
               interfere with the ability of the client to restart at
               any point.  Even if the client restarts and does not write
               any data to "sysk_ima_snapshot_file" after recording the
               "snapshot_aggregate" event, IMA will not purge the records
               from the IMA log.  There is no data loss, and the system
               overall keeps the ability to attest itself remotely.
               Similarly, the remote-attestation service being stateless
               or not is orthogonal to the IMA log snapshot feature.
               As discussed in detail in section D.4.b above, that
               decision should be made by the service owner after
               weighing the pros and cons.

     e. In v1 of this RFC proposal, there was discussion on using 'tmpfs'
        as a location to store the snapshot.  Even though 'tmpfs' may
        simplify some aspects of implementing the snapshotting feature,
        it doesn't really help achieve the main goal of the feature -
        i.e. reducing the memory pressure on the Kernel.
        This is because:
           - The 'tmpfs' is part of the system's memory and not
             persistent storage[8].  Moving part of IMA log to 'tmpfs'
             doesn't really address the memory pressure problem we are
             trying to solve.
           - The 'tmpfs' is ephemeral storage.  Its contents are lost
             on reboots.  Therefore, during kexec soft-boots, the
             contents would have to be brought back to the main memory
             so that they can be mapped to the memory of the next kernel.
             Otherwise, remote attestation would fail post kexec soft
             reboot.
        Because of the above reasons, we will not store snapshots on
        'tmpfs'.

     f. In v1 of this RFC proposal, there was discussion about supporting
        the TPM seal-unseal scenario in the context of IMA log
        snapshots[9].  In a typical seal-unseal scenario, some secret is
        sealed against the TPM PCR values on the service side.  The
        service may check that the IMA log has the expected contents
        before sealing the secret.  The TPM on the client side can only
        unseal the secret if the PCR values have not changed since the
        time the secret was sealed.  IMA log snapshotting as a feature
        does not alter this seal-unseal process, and therefore should
        not impact it.  However, as mentioned in section D.4, changes
        are needed in the client/service-side functionality to request
        snapshots to verify the contents of the IMA log which are not
        present in the latest log segment.

We appreciate your comments on this proposal.

=======================================================================
| G. References                                                       |
=======================================================================
[1] [RFC] IMA Log Snapshotting Design Proposal (V1)
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#m35e7d885db365b0aac4382830b39d446eb5efdb5

[2] IMA: support for measuring kernel integrity critical data
https://patchwork.kernel.org/project/linux-integrity/cover/20210108040708.8389-1-tusharsu@linux.microsoft.com/

[3] Measuring TPM update counter in IMA
https://patchwork.kernel.org/project/linux-integrity/cover/20230801181917.8535-1-tusharsu@linux.microsoft.com/

[4] Re: [PATCH v2] ima: export the measurement list when needed
https://lore.kernel.org/linux-integrity/1580998432.5585.411.camel@linux.ibm.com/

[5] TPM-PCR Update counters in IMA log snapshots
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#m775b8b092e06bed4694b418e9d642c4d2642ee8f

[6] Adding EK/AIK Public Cert to the IMA log snapshots
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#m48ec6d9ceaca40030604b2336f0ef0ada8d39e6a
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#m41081713da1ca15867b623bf7d2ffb822c1fe3fd
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#mb8bc2398ce4d2d71bce539776169e53b9f33c8d8

[7] About Client restarts and remote attestation service being
       stateless
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#m33fc4005b94e6fab79153800a06af40477c3be65 


[8] Tmpfs
https://www.kernel.org/doc/html/latest/filesystems/tmpfs.html

[9] TPM Seal-Unseal
https://lore.kernel.org/linux-integrity/c5737141-7827-1c83-ab38-0119dcfea485@linux.microsoft.com/T/#m12c1b35c6d130a6ee181a8cce4c1ebdea412a19b

Thanks!
Sush & Tushar

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-10-19 18:49 [RFC V2] IMA Log Snapshotting Design Proposal Tushar Sugandhi
@ 2023-10-31 18:37 ` Ken Goldman
  2023-11-13 18:14   ` Sush Shringarputale
  2023-10-31 19:15 ` Mimi Zohar
  2023-11-13 18:59 ` Stefan Berger
  2 siblings, 1 reply; 30+ messages in thread
From: Ken Goldman @ 2023-10-31 18:37 UTC (permalink / raw)
  To: Tushar Sugandhi, linux-integrity, Mimi Zohar, peterhuewe,
	Jarkko Sakkinen, jgg, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, Paul Moore, serge,
	James Bottomley, linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian, Sush Shringarputale

On 10/19/2023 2:49 PM, Tushar Sugandhi wrote:
>    f. A new event, "snapshot_aggregate", will be computed and measured
>         in the IMA log as part of this feature.  It should help the
>         remote-attestation client/service to benefit from the IMA log
>         snapshot feature.
>         The "snapshot_aggregate" event is described in more details in
>         section "D.1 Snapshot Aggregate Event" below.

What is the use case for the snapshot aggregate?  My thinking is:

1. The platform must retain the entire measurement list.  Early 
measurements can never be discarded because a new quote verifier
must receive the entire log starting at the first measurement.

In this case, isn't the snapshot aggregate redundant?

2. There is a disadvantage to redundant data.  The verifier must support 
this new event type. It receives this event and must validate the 
aggregate against the snapshot-ed events. This is an attack surface. 
The attacker can send an aggregate and snapshot-ed measurements that do 
not match to exploit a flaw in the verifier.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-10-19 18:49 [RFC V2] IMA Log Snapshotting Design Proposal Tushar Sugandhi
  2023-10-31 18:37 ` Ken Goldman
@ 2023-10-31 19:15 ` Mimi Zohar
  2023-11-16 22:28   ` Paul Moore
  2023-11-13 18:59 ` Stefan Berger
  2 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2023-10-31 19:15 UTC (permalink / raw)
  To: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, Paul Moore, serge,
	James Bottomley, linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian, Sush Shringarputale

On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote:

[...]
> -----------------------------------------------------------------------
> | C.1 Solution Summary                                                |
> -----------------------------------------------------------------------
> To achieve the goals described in the section above, we propose the
> following changes to the IMA subsystem.
> 
>      a. The IMA log from Kernel memory will be offloaded to some
>         persistent storage disk to keep the system running reliably
>         without facing memory pressure.
>         More details, alternate approaches considered etc. are present
>         in section "D.3 Choices for Storing Snapshots" below.
> 
>      b. The IMA log will be divided into multiple chunks (snapshots).
>         Each snapshot would be a delta between the two instances when
>         the log was offloaded from memory to the persistent storage
>         disk.
> 
>      c. Some UM process (like a remote-attestation-client) will be
>         responsible for writing the IMA log snapshot to the disk.
> 
>      d. The same UM process would be responsible for triggering the IMA
>         log snapshot.
> 
>      e. There will be a well-known location for storing the IMA log
>         snapshots on the disk.  It will be non-trivial for UM processes
>         to change that location after booting into the Kernel.
> 
>      f. A new event, "snapshot_aggregate", will be computed and measured
>         in the IMA log as part of this feature.  It should help the
>         remote-attestation client/service to benefit from the IMA log
>         snapshot feature.
>         The "snapshot_aggregate" event is described in more details in
>         section "D.1 Snapshot Aggregate Event" below.
> 
>      g. If the existing remote-attestation client/services do not change
>         to benefit from this feature or do not trigger the snapshot,
>         the Kernel will continue to have it's current functionality of
>         maintaining an in-memory full IMA log.
> 
> Additionally, the remote-attestation client/services need to be updated
> to benefit from the IMA log snapshot feature.  These proposed changes
> are described in section "D.4 Remote-Attestation Client/Service Side
> Changes" below, but their implementation is out of scope for this
> proposal.

As previously said on v1,
   This design seems overly complex and requires synchronization between the
   "snapshot" record and exporting the records from the measurement list. [...] 

   Concerns:
   - Pausing extending the measurement list.

Nothing has changed in terms of the complexity or in terms of pausing
the measurement list.   Pausing the measurement list is a non starter.

Userspace can already export the IMA measurement list(s) via the
securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
it wants with it.  All that is missing in the kernel is the ability to
trim the measurement list, which doesn't seem all that complicated.

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-10-31 18:37 ` Ken Goldman
@ 2023-11-13 18:14   ` Sush Shringarputale
  0 siblings, 0 replies; 30+ messages in thread
From: Sush Shringarputale @ 2023-11-13 18:14 UTC (permalink / raw)
  To: Ken Goldman, Tushar Sugandhi, linux-integrity, Mimi Zohar,
	peterhuewe, Jarkko Sakkinen, jgg, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, Paul Moore, serge,
	James Bottomley, linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian



On 10/31/2023 11:37 AM, Ken Goldman wrote:
> On 10/19/2023 2:49 PM, Tushar Sugandhi wrote:
>>    f. A new event, "snapshot_aggregate", will be computed and measured
>>         in the IMA log as part of this feature.  It should help the
>>         remote-attestation client/service to benefit from the IMA log
>>         snapshot feature.
>>         The "snapshot_aggregate" event is described in more details in
>>         section "D.1 Snapshot Aggregate Event" below.
>
> What is the use case for the snapshot aggregate?  My thinking is:
>
> 1. The platform must retain the entire measurement list.  Early 
> measurements can never be discarded because a new quote verifier
> must receive the entire log starting at the first measurement.
>
> In this case, isn't the snapshot aggregate redundant?
Not quite. The snapshot aggregate still has a purpose, which is to stitch
together the snapshots on the disk and the in-memory segment of the IMA
log. The specific details are in the RFC Section D.1, quoted here:

The "snapshot_aggregate" marker provides the following benefits:

a. It facilitates the IMA log to be divided into multiple chunks and
provides mechanism to verify the integrity of the system using only the
latest chunks during remote attestation.

b. It provides tangible evidence from Kernel to the attestation client
that IMA log snapshotting has been enabled and at least one snapshot
exists on the system.

c. It helps both the Kernel and UM attestation client define clear
boundaries between multiple snapshots.

d. In the event of multiple snapshots, the last measured
"snapshot_aggregate" marker, which is present in the current segment of
the IMA log, has sufficient information to verify the integrity of the
IMA log segment as well as the previous snapshots using the PCR quotes.

e. In the event of multiple snapshots, say N, if the remote-attestation
service has already processed the last N-1 snapshots, it can efficiently
parse through them by just processing "snapshot_aggregate" events to
compute the PCR quotes needed to verify the events in the last snapshot.
This should drastically improve the IMA log processing efficiency of
the service.

>
> 2. There is a disadvantage to redundant data.  The verifier must 
> support this new event type. It receives this event and must validate 
> the aggregate against the snapshot-ed events. This is an attack 
> surface. The attacker can send an aggregate and snapshot-ed 
> measurements that do not match to exploit a flaw in the verifier.
I disagree with this.  Redundancy is a moot point because
"snapshot_aggregate" is required for the points mentioned above.
- Sush

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-10-19 18:49 [RFC V2] IMA Log Snapshotting Design Proposal Tushar Sugandhi
  2023-10-31 18:37 ` Ken Goldman
  2023-10-31 19:15 ` Mimi Zohar
@ 2023-11-13 18:59 ` Stefan Berger
  2023-11-14 18:36   ` Sush Shringarputale
  2 siblings, 1 reply; 30+ messages in thread
From: Stefan Berger @ 2023-11-13 18:59 UTC (permalink / raw)
  To: Tushar Sugandhi, linux-integrity, Mimi Zohar, peterhuewe,
	Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, Paul Moore, serge,
	James Bottomley, linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian, Sush Shringarputale



On 10/19/23 14:49, Tushar Sugandhi wrote:
> =======================================================================
> | Introduction                                                        |
> =======================================================================
> This document provides a detailed overview of the proposed Kernel
> feature IMA log snapshotting.  It describes the motivation behind the
> proposal, the problem to be solved, a detailed solution design with
> examples, and describes the changes to be made in the clients/services
> which are part of remote-attestation system.  This is the 2nd version
> of the proposal.  The first version is present here[1].
> 
> Table of Contents:
> ------------------
> A. Motivation and Background
> B. Goals and Non-Goals
>      B.1 Goals
>      B.2 Non-Goals
> C. Proposed Solution
>      C.1 Solution Summary
>      C.2 High-level Work-flow
> D. Detailed Design
>      D.1 Snapshot Aggregate Event
>      D.2 Snapshot Triggering Mechanism
>      D.3 Choosing A Persistent Storage Location For Snapshots
>      D.4 Remote-Attestation Client/Service-side Changes
>          D.4.a Client-side Changes
>          D.4.b Service-side Changes
> E. Example Walk-through
> F. Other Design Considerations
> G. References
> 

Userspace applications will have to know
a) where are the shard files?
b) how do I read the shard files while locking out the producer of the 
shard files?

IMO, this will require a well known config file and a locking method 
(flock) so that user space applications can work together in this new 
environment. The lock could be defined in the config file or just be the 
config file itself.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-13 18:59 ` Stefan Berger
@ 2023-11-14 18:36   ` Sush Shringarputale
  2023-11-14 18:58     ` Stefan Berger
  0 siblings, 1 reply; 30+ messages in thread
From: Sush Shringarputale @ 2023-11-14 18:36 UTC (permalink / raw)
  To: Stefan Berger, Tushar Sugandhi, linux-integrity, Mimi Zohar,
	peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal,
	Dave Young, kexec@lists.infradead.org, jmorris, Paul Moore, serge,
	James Bottomley, linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian



On 11/13/2023 10:59 AM, Stefan Berger wrote:
>
>
> On 10/19/23 14:49, Tushar Sugandhi wrote:
>> =======================================================================
>> | Introduction |
>> =======================================================================
>> This document provides a detailed overview of the proposed Kernel
>> feature IMA log snapshotting.  It describes the motivation behind the
>> proposal, the problem to be solved, a detailed solution design with
>> examples, and describes the changes to be made in the clients/services
>> which are part of remote-attestation system.  This is the 2nd version
>> of the proposal.  The first version is present here[1].
>>
>> Table of Contents:
>> ------------------
>> A. Motivation and Background
>> B. Goals and Non-Goals
>>      B.1 Goals
>>      B.2 Non-Goals
>> C. Proposed Solution
>>      C.1 Solution Summary
>>      C.2 High-level Work-flow
>> D. Detailed Design
>>      D.1 Snapshot Aggregate Event
>>      D.2 Snapshot Triggering Mechanism
>>      D.3 Choosing A Persistent Storage Location For Snapshots
>>      D.4 Remote-Attestation Client/Service-side Changes
>>          D.4.a Client-side Changes
>>          D.4.b Service-side Changes
>> E. Example Walk-through
>> F. Other Design Considerations
>> G. References
>>
>
> Userspace applications will have to know
> a) where are the shard files?
We describe the file storage location choices in section D.3, but user
applications will have to query the well-known location described there.
> b) how do I read the shard files while locking out the producer of the 
> shard files?
>
> IMO, this will require a well known config file and a locking method 
> (flock) so that user space applications can work together in this new 
> environment. The lock could be defined in the config file or just be 
> the config file itself.
The flock is a good idea for co-ordination between UM clients. While
the Kernel cannot enforce any access in this way, any UM process that
is planning on triggering the snapshot mechanism should follow that
protocol.  We will ensure we document that as the best-practices in
the patch series.
- Sush

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-14 18:36   ` Sush Shringarputale
@ 2023-11-14 18:58     ` Stefan Berger
  2023-11-16 22:07       ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Berger @ 2023-11-14 18:58 UTC (permalink / raw)
  To: Sush Shringarputale, Tushar Sugandhi, linux-integrity, Mimi Zohar,
	peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal,
	Dave Young, kexec@lists.infradead.org, jmorris, Paul Moore, serge,
	James Bottomley, linux-security-module
  Cc: Tyler Hicks, Lakshmi Ramasubramanian



On 11/14/23 13:36, Sush Shringarputale wrote:
> 
> 
> On 11/13/2023 10:59 AM, Stefan Berger wrote:
>>
>>
>> On 10/19/23 14:49, Tushar Sugandhi wrote:
>>> =======================================================================
>>> | Introduction |
>>> =======================================================================
>>> This document provides a detailed overview of the proposed Kernel
>>> feature IMA log snapshotting.  It describes the motivation behind the
>>> proposal, the problem to be solved, a detailed solution design with
>>> examples, and describes the changes to be made in the clients/services
>>> which are part of remote-attestation system.  This is the 2nd version
>>> of the proposal.  The first version is present here[1].
>>>
>>> Table of Contents:
>>> ------------------
>>> A. Motivation and Background
>>> B. Goals and Non-Goals
>>>      B.1 Goals
>>>      B.2 Non-Goals
>>> C. Proposed Solution
>>>      C.1 Solution Summary
>>>      C.2 High-level Work-flow
>>> D. Detailed Design
>>>      D.1 Snapshot Aggregate Event
>>>      D.2 Snapshot Triggering Mechanism
>>>      D.3 Choosing A Persistent Storage Location For Snapshots
>>>      D.4 Remote-Attestation Client/Service-side Changes
>>>          D.4.a Client-side Changes
>>>          D.4.b Service-side Changes
>>> E. Example Walk-through
>>> F. Other Design Considerations
>>> G. References
>>>
>>
>> Userspace applications will have to know
>> a) where are the shard files?
> We describe the file storage location choices in section D.3, but user
> applications will have to query the well-known location described there.
>> b) how do I read the shard files while locking out the producer of the 
>> shard files?
>>
>> IMO, this will require a well known config file and a locking method 
>> (flock) so that user space applications can work together in this new 
>> environment. The lock could be defined in the config file or just be 
>> the config file itself.
> The flock is a good idea for co-ordination between UM clients. While
> the Kernel cannot enforce any access in this way, any UM process that
> is planning on triggering the snapshot mechanism should follow that
> protocol.  We will ensure we document that as the best-practices in
> the patch series.

It's more than 'best practices'. You need a well-known config file with 
well-known config options in it.

All clients that were previously just trying to read new bytes from the 
IMA log cannot do this anymore in the presence of a log shard producer 
but have to also learn that a new log shard has been produced so they 
need to figure out the new position in the log where to read from. So 
maybe a counter in a config file should indicate to the log readers that 
a new log has been produced -- otherwise they would have to monitor all 
the log shard files or the log shard file's size.
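
As a rough illustration of the reader side (the counter-file path and its
format below are assumptions made up for this sketch, not part of the
proposal):

/*
 * Hypothetical sketch: a log reader detects that a new shard has been
 * produced by comparing a counter published by the shard producer.
 * The counter-file path and its format are invented for illustration.
 */
#include <stdio.h>

static long read_shard_counter(void)
{
        FILE *f = fopen("/var/lib/ima/shard_counter", "r");
        long count = -1;

        if (!f)
                return -1;
        if (fscanf(f, "%ld", &count) != 1)
                count = -1;
        fclose(f);
        return count;
}

/* Called periodically by a reader that remembers the last value it saw. */
void check_for_new_shard(long *last_seen)
{
        long now = read_shard_counter();

        if (now >= 0 && now != *last_seen) {
                /* New shard(s) exist: re-scan the shard files and reset
                 * the read position in the in-memory IMA log. */
                *last_seen = now;
        }
}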

If the log-shard producer were configured to discard leading parts of 
the log, then that should also be noted in a config file so that clients 
which need to see the beginning of the log can refuse early on to work 
on a machine that either is configured this way or where the discarding 
has already happened.

   Stefan

> - Sush

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-14 18:58     ` Stefan Berger
@ 2023-11-16 22:07       ` Paul Moore
  2023-11-16 22:41         ` Stefan Berger
  2023-11-20 20:03         ` Tushar Sugandhi
  0 siblings, 2 replies; 30+ messages in thread
From: Paul Moore @ 2023-11-16 22:07 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Sush Shringarputale, Tushar Sugandhi, linux-integrity, Mimi Zohar,
	peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal,
	Dave Young, kexec@lists.infradead.org, jmorris, serge,
	James Bottomley, linux-security-module, Tyler Hicks,
	Lakshmi Ramasubramanian

On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
> On 11/14/23 13:36, Sush Shringarputale wrote:
> > On 11/13/2023 10:59 AM, Stefan Berger wrote:
> >> On 10/19/23 14:49, Tushar Sugandhi wrote:
> >>> =======================================================================
> >>> | Introduction |
> >>> =======================================================================
> >>> This document provides a detailed overview of the proposed Kernel
> >>> feature IMA log snapshotting.  It describes the motivation behind the
> >>> proposal, the problem to be solved, a detailed solution design with
> >>> examples, and describes the changes to be made in the clients/services
> >>> which are part of remote-attestation system.  This is the 2nd version
> >>> of the proposal.  The first version is present here[1].
> >>>
> >>> Table of Contents:
> >>> ------------------
> >>> A. Motivation and Background
> >>> B. Goals and Non-Goals
> >>>      B.1 Goals
> >>>      B.2 Non-Goals
> >>> C. Proposed Solution
> >>>      C.1 Solution Summary
> >>>      C.2 High-level Work-flow
> >>> D. Detailed Design
> >>>      D.1 Snapshot Aggregate Event
> >>>      D.2 Snapshot Triggering Mechanism
> >>>      D.3 Choosing A Persistent Storage Location For Snapshots
> >>>      D.4 Remote-Attestation Client/Service-side Changes
> >>>          D.4.a Client-side Changes
> >>>          D.4.b Service-side Changes
> >>> E. Example Walk-through
> >>> F. Other Design Considerations
> >>> G. References
> >>>
> >>
> >> Userspace applications will have to know
> >> a) where are the shard files?
> > We describe the file storage location choices in section D.3, but user
> > applications will have to query the well-known location described there.
> >> b) how do I read the shard files while locking out the producer of the
> >> shard files?
> >>
> >> IMO, this will require a well known config file and a locking method
> >> (flock) so that user space applications can work together in this new
> >> environment. The lock could be defined in the config file or just be
> >> the config file itself.
> > The flock is a good idea for co-ordination between UM clients. While
> > the Kernel cannot enforce any access in this way, any UM process that
> > is planning on triggering the snapshot mechanism should follow that
> > protocol.  We will ensure we document that as the best-practices in
> > the patch series.
>
> It's more than 'best practices'. You need a well-known config file with
> well-known config options in it.
>
> All clients that were previously just trying to read new bytes from the
> IMA log cannot do this anymore in the presence of a log shard producer
> but have to also learn that a new log shard has been produced so they
> need to figure out the new position in the log where to read from. So
> maybe a counter in a config file should indicate to the log readers that
> a new log has been produced -- otherwise they would have to monitor all
> the log shard files or the log shard file's size.

If a counter is needed, I would suggest placing it somewhere other
than the config file so that we can enforce limited write access to
the config file.

Regardless, I imagine there are a few ways one could synchronize
various userspace applications such that they see a consistent view of
the decomposed log state, and the good news is that the approach
described here is opt-in from a userspace perspective.  If the
userspace does not fully support IMA log snapshotting then it never
needs to trigger it and the system behaves as it does today; on the
other hand, if the userspace has been updated it can make use of the
new functionality to better manage the size of the IMA measurement
log.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-10-31 19:15 ` Mimi Zohar
@ 2023-11-16 22:28   ` Paul Moore
  2023-11-22  1:01     ` Tushar Sugandhi
  2023-11-22  4:27     ` Paul Moore
  0 siblings, 2 replies; 30+ messages in thread
From: Paul Moore @ 2023-11-16 22:28 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote:
>
> [...]
> > -----------------------------------------------------------------------
> > | C.1 Solution Summary                                                |
> > -----------------------------------------------------------------------
> > To achieve the goals described in the section above, we propose the
> > following changes to the IMA subsystem.
> >
> >      a. The IMA log from Kernel memory will be offloaded to some
> >         persistent storage disk to keep the system running reliably
> >         without facing memory pressure.
> >         More details, alternate approaches considered etc. are present
> >         in section "D.3 Choices for Storing Snapshots" below.
> >
> >      b. The IMA log will be divided into multiple chunks (snapshots).
> >         Each snapshot would be a delta between the two instances when
> >         the log was offloaded from memory to the persistent storage
> >         disk.
> >
> >      c. Some UM process (like a remote-attestation-client) will be
> >         responsible for writing the IMA log snapshot to the disk.
> >
> >      d. The same UM process would be responsible for triggering the IMA
> >         log snapshot.
> >
> >      e. There will be a well-known location for storing the IMA log
> >         snapshots on the disk.  It will be non-trivial for UM processes
> >         to change that location after booting into the Kernel.
> >
> >      f. A new event, "snapshot_aggregate", will be computed and measured
> >         in the IMA log as part of this feature.  It should help the
> >         remote-attestation client/service to benefit from the IMA log
> >         snapshot feature.
> >         The "snapshot_aggregate" event is described in more details in
> >         section "D.1 Snapshot Aggregate Event" below.
> >
> >      g. If the existing remote-attestation client/services do not change
> >         to benefit from this feature or do not trigger the snapshot,
> >         the Kernel will continue to have its current functionality of
> >         maintaining an in-memory full IMA log.
> >
> > Additionally, the remote-attestation client/services need to be updated
> > to benefit from the IMA log snapshot feature.  These proposed changes
> >
> > are described in section "D.4 Remote-Attestation Client/Service Side
> > Changes" below, but their implementation is out of scope for this
> > proposal.
>
> As previously said on v1,
>    This design seems overly complex and requires synchronization between the
>    "snapshot" record and exporting the records from the measurement list. [...]
>
>    Concerns:
>    - Pausing extending the measurement list.
>
> Nothing has changed in terms of the complexity or in terms of pausing
> the measurement list.   Pausing the measurement list is a non starter.

The measurement list would only need to be paused for the amount of
time it would require to generate the snapshot_aggregate entry, which
should be minimal and only occurs when a privileged userspace requests
a snapshot operation.  The snapshot remains opt-in functionality, and
even then there is the possibility that the kernel could reject the
snapshot request if generating the snapshot_aggregate entry was deemed
too costly (as determined by the kernel) at that point in time.

> Userspace can already export the IMA measurement list(s) via the
> securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> it wants with it.  All that is missing in the kernel is the ability to
> trim the measurement list, which doesn't seem all that complicated.

From my perspective what has been presented is basically just trimming
the in-memory measurement log, the additional complexity (which really
doesn't look that bad IMO) is there to ensure robustness in the face
of an unreliable userspace (processes die, get killed, etc.) and to
establish a new, transitive root of trust in the newly trimmed
in-memory log.

I suppose one could simplify things greatly by having a design where
userspace  captures the measurement log and then writes the number of
measurement records to trim from the start of the measurement log to a
sysfs file and the kernel acts on that.  You could do this with, or
without, the snapshot_aggregate entry concept; in fact that could be
something that was controlled by userspace, e.g. write the number of
lines and a flag to indicate if a snapshot_aggregate was desired to
the sysfs file.  I can't say I've thought it all the way through to
make sure there are no gotchas, but I'm guessing that is about as
simple as one can get.
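
Purely as a sketch of that idea (neither the securityfs node nor the
"<count> <flag>" format below exists today; both are invented for
illustration):

/*
 * Hypothetical sketch only: ask the kernel to trim the first N records
 * from the in-memory IMA measurement list, optionally requesting that a
 * snapshot_aggregate entry be generated first.  The securityfs node and
 * the "<count> <flag>" format are made up to illustrate the idea above.
 */
#include <stdio.h>

static int request_trim(unsigned long records, int want_aggregate)
{
        FILE *f = fopen("/sys/kernel/security/ima/trim_measurements", "w");

        if (!f)
                return -1;
        fprintf(f, "%lu %d\n", records, want_aggregate);
        return fclose(f);
}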

If there is something else you had in mind, Mimi, please share the
details.  This is a very real problem we are facing and we want to
work to get a solution upstream.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-16 22:07       ` Paul Moore
@ 2023-11-16 22:41         ` Stefan Berger
  2023-11-16 22:56           ` Paul Moore
  2023-11-20 20:03         ` Tushar Sugandhi
  1 sibling, 1 reply; 30+ messages in thread
From: Stefan Berger @ 2023-11-16 22:41 UTC (permalink / raw)
  To: Paul Moore
  Cc: Sush Shringarputale, Tushar Sugandhi, linux-integrity, Mimi Zohar,
	peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal,
	Dave Young, kexec@lists.infradead.org, jmorris, serge,
	James Bottomley, linux-security-module, Tyler Hicks,
	Lakshmi Ramasubramanian



On 11/16/23 17:07, Paul Moore wrote:
> On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
>> On 11/14/23 13:36, Sush Shringarputale wrote:
>>> On 11/13/2023 10:59 AM, Stefan Berger wrote:
>>>> On 10/19/23 14:49, Tushar Sugandhi wrote:
>>>>> =======================================================================
>>>>> | Introduction |
>>>>> =======================================================================
>>>>> This document provides a detailed overview of the proposed Kernel
>>>>> feature IMA log snapshotting.  It describes the motivation behind the
>>>>> proposal, the problem to be solved, a detailed solution design with
>>>>> examples, and describes the changes to be made in the clients/services
>>>>> which are part of remote-attestation system.  This is the 2nd version
>>>>> of the proposal.  The first version is present here[1].
>>>>>
>>>>> Table of Contents:
>>>>> ------------------
>>>>> A. Motivation and Background
>>>>> B. Goals and Non-Goals
>>>>>       B.1 Goals
>>>>>       B.2 Non-Goals
>>>>> C. Proposed Solution
>>>>>       C.1 Solution Summary
>>>>>       C.2 High-level Work-flow
>>>>> D. Detailed Design
>>>>>       D.1 Snapshot Aggregate Event
>>>>>       D.2 Snapshot Triggering Mechanism
>>>>>       D.3 Choosing A Persistent Storage Location For Snapshots
>>>>>       D.4 Remote-Attestation Client/Service-side Changes
>>>>>           D.4.a Client-side Changes
>>>>>           D.4.b Service-side Changes
>>>>> E. Example Walk-through
>>>>> F. Other Design Considerations
>>>>> G. References
>>>>>
>>>>
>>>> Userspace applications will have to know
>>>> a) where are the shard files?
>>> We describe the file storage location choices in section D.3, but user
>>> applications will have to query the well-known location described there.
>>>> b) how do I read the shard files while locking out the producer of the
>>>> shard files?
>>>>
>>>> IMO, this will require a well known config file and a locking method
>>>> (flock) so that user space applications can work together in this new
>>>> environment. The lock could be defined in the config file or just be
>>>> the config file itself.
>>> The flock is a good idea for co-ordination between UM clients. While
>>> the Kernel cannot enforce any access in this way, any UM process that
>>> is planning on triggering the snapshot mechanism should follow that
>>> protocol.  We will ensure we document that as the best-practices in
>>> the patch series.
>>
>> It's more than 'best practices'. You need a well-known config file with
>> well-known config options in it.
>>
>> All clients that were previously just trying to read new bytes from the
>> IMA log cannot do this anymore in the presence of a log shard producer
>> but have to also learn that a new log shard has been produced so they
>> need to figure out the new position in the log where to read from. So
>> maybe a counter in a config file should indicate to the log readers that
>> a new log has been produced -- otherwise they would have to monitor all
>> the log shard files or the log shard file's size.
> 
> If a counter is needed, I would suggest placing it somewhere other
> than the config file so that we can enforce limited write access to
> the config file.
> 
> Regardless, I imagine there are a few ways one could synchronize
> various userspace applications such that they see a consistent view of
> the decomposed log state, and the good news is that the approach
> described here is opt-in from a userspace perspective.  If the

A FUSE filesystem that stitches together the log shards from one or 
multiple files + IMA log file(s) could make this approach transparent 
for as long as log shards are not thrown away. Presumably it (or root) 
could bind-mount its files over the two IMA log files.

> userspace does not fully support IMA log snapshotting then it never
> needs to trigger it and the system behaves as it does today; on the

I don't think individual applications should trigger it; instead, some 
dedicated background process running on the machine would do that every n 
log entries or so, and possibly offer the FUSE filesystem at the same 
time. In either case, once any application triggers it, all of them 
either have to know how to deal with the shards or FUSE would make it 
completely transparent.

> other hand, if the userspace has been updated it can make use of the
> new functionality to better manage the size of the IMA measurement
> log.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-16 22:41         ` Stefan Berger
@ 2023-11-16 22:56           ` Paul Moore
  2023-11-17 22:41             ` Sush Shringarputale
  0 siblings, 1 reply; 30+ messages in thread
From: Paul Moore @ 2023-11-16 22:56 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Sush Shringarputale, Tushar Sugandhi, linux-integrity, Mimi Zohar,
	peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal,
	Dave Young, kexec@lists.infradead.org, jmorris, serge,
	James Bottomley, linux-security-module, Tyler Hicks,
	Lakshmi Ramasubramanian

On Thu, Nov 16, 2023 at 5:41 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
> On 11/16/23 17:07, Paul Moore wrote:
> > On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
> >> On 11/14/23 13:36, Sush Shringarputale wrote:
> >>> On 11/13/2023 10:59 AM, Stefan Berger wrote:
> >>>> On 10/19/23 14:49, Tushar Sugandhi wrote:
> >>>>> =======================================================================
> >>>>> | Introduction |
> >>>>> =======================================================================
> >>>>> This document provides a detailed overview of the proposed Kernel
> >>>>> feature IMA log snapshotting.  It describes the motivation behind the
> >>>>> proposal, the problem to be solved, a detailed solution design with
> >>>>> examples, and describes the changes to be made in the clients/services
> >>>>> which are part of remote-attestation system.  This is the 2nd version
> >>>>> of the proposal.  The first version is present here[1].
> >>>>>
> >>>>> Table of Contents:
> >>>>> ------------------
> >>>>> A. Motivation and Background
> >>>>> B. Goals and Non-Goals
> >>>>>       B.1 Goals
> >>>>>       B.2 Non-Goals
> >>>>> C. Proposed Solution
> >>>>>       C.1 Solution Summary
> >>>>>       C.2 High-level Work-flow
> >>>>> D. Detailed Design
> >>>>>       D.1 Snapshot Aggregate Event
> >>>>>       D.2 Snapshot Triggering Mechanism
> >>>>>       D.3 Choosing A Persistent Storage Location For Snapshots
> >>>>>       D.4 Remote-Attestation Client/Service-side Changes
> >>>>>           D.4.a Client-side Changes
> >>>>>           D.4.b Service-side Changes
> >>>>> E. Example Walk-through
> >>>>> F. Other Design Considerations
> >>>>> G. References
> >>>>>
> >>>>
> >>>> Userspace applications will have to know
> >>>> a) where are the shard files?
> >>> We describe the file storage location choices in section D.3, but user
> >>> applications will have to query the well-known location described there.
> >>>> b) how do I read the shard files while locking out the producer of the
> >>>> shard files?
> >>>>
> >>>> IMO, this will require a well known config file and a locking method
> >>>> (flock) so that user space applications can work together in this new
> >>>> environment. The lock could be defined in the config file or just be
> >>>> the config file itself.
> >>> The flock is a good idea for co-ordination between UM clients. While
> >>> the Kernel cannot enforce any access in this way, any UM process that
> >>> is planning on triggering the snapshot mechanism should follow that
> >>> protocol.  We will ensure we document that as the best-practices in
> >>> the patch series.
> >>
> >> It's more than 'best practices'. You need a well-known config file with
> >> well-known config options in it.
> >>
> >> All clients that were previously just trying to read new bytes from the
> >> IMA log cannot do this anymore in the presence of a log shard producer
> >> but have to also learn that a new log shard has been produced so they
> >> need to figure out the new position in the log where to read from. So
> >> maybe a counter in a config file should indicate to the log readers that
> >> a new log has been produced -- otherwise they would have to monitor all
> >> the log shard files or the log shard file's size.
> >
> > If a counter is needed, I would suggest placing it somewhere other
> > than the config file so that we can enforce limited write access to
> > the config file.
> >
> > Regardless, I imagine there are a few ways one could synchronize
> > various userspace applications such that they see a consistent view of
> > the decomposed log state, and the good news is that the approach
> > described here is opt-in from a userspace perspective.  If the
>
> A FUSE filesystem that stitches together the log shards from one or
> multiple files + IMA log file(s) could make this approach transparent
> for as long as log shards are not thrown away. Presumably it (or root)
> could bind-mount its files over the two IMA log files.
>
> > userspace does not fully support IMA log snapshotting then it never
> > needs to trigger it and the system behaves as it does today; on the
>
> I don't think individual applications should trigger it , instead some
> dedicated background process running on a machine would do that every n
> log entries or so and possibly offer the FUSE filesystem at the same
> time. In either case, once any application triggers it, all either have
> to know how to deal with the shards or FUSE would make it completely
> transparent.

Yes, performing a snapshot is a privileged operation which I expect
would be done and managed by a dedicated daemon running on the system.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-16 22:56           ` Paul Moore
@ 2023-11-17 22:41             ` Sush Shringarputale
  0 siblings, 0 replies; 30+ messages in thread
From: Sush Shringarputale @ 2023-11-17 22:41 UTC (permalink / raw)
  To: Paul Moore, Stefan Berger
  Cc: Tushar Sugandhi, linux-integrity, Mimi Zohar, peterhuewe,
	Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian



On 11/16/2023 2:56 PM, Paul Moore wrote:
> On Thu, Nov 16, 2023 at 5:41 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
>> On 11/16/23 17:07, Paul Moore wrote:
>>> On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
>>>> On 11/14/23 13:36, Sush Shringarputale wrote:
>>>>> On 11/13/2023 10:59 AM, Stefan Berger wrote:
>>>>>> On 10/19/23 14:49, Tushar Sugandhi wrote:
>>>>>>> =======================================================================
>>>>>>> | Introduction |
>>>>>>> =======================================================================
>>>>>>> This document provides a detailed overview of the proposed Kernel
>>>>>>> feature IMA log snapshotting.  It describes the motivation behind the
>>>>>>> proposal, the problem to be solved, a detailed solution design with
>>>>>>> examples, and describes the changes to be made in the clients/services
>>>>>>> which are part of remote-attestation system.  This is the 2nd version
>>>>>>> of the proposal.  The first version is present here[1].
>>>>>>>
>>>>>>> Table of Contents:
>>>>>>> ------------------
>>>>>>> A. Motivation and Background
>>>>>>> B. Goals and Non-Goals
>>>>>>>        B.1 Goals
>>>>>>>        B.2 Non-Goals
>>>>>>> C. Proposed Solution
>>>>>>>        C.1 Solution Summary
>>>>>>>        C.2 High-level Work-flow
>>>>>>> D. Detailed Design
>>>>>>>        D.1 Snapshot Aggregate Event
>>>>>>>        D.2 Snapshot Triggering Mechanism
>>>>>>>        D.3 Choosing A Persistent Storage Location For Snapshots
>>>>>>>        D.4 Remote-Attestation Client/Service-side Changes
>>>>>>>            D.4.a Client-side Changes
>>>>>>>            D.4.b Service-side Changes
>>>>>>> E. Example Walk-through
>>>>>>> F. Other Design Considerations
>>>>>>> G. References
>>>>>>>
>>>>>> Userspace applications will have to know
>>>>>> a) where are the shard files?
>>>>> We describe the file storage location choices in section D.3, but user
>>>>> applications will have to query the well-known location described there.
>>>>>> b) how do I read the shard files while locking out the producer of the
>>>>>> shard files?
>>>>>>
>>>>>> IMO, this will require a well known config file and a locking method
>>>>>> (flock) so that user space applications can work together in this new
>>>>>> environment. The lock could be defined in the config file or just be
>>>>>> the config file itself.
>>>>> The flock is a good idea for co-ordination between UM clients. While
>>>>> the Kernel cannot enforce any access in this way, any UM process that
>>>>> is planning on triggering the snapshot mechanism should follow that
>>>>> protocol.  We will ensure we document that as the best-practices in
>>>>> the patch series.
>>>> It's more than 'best practices'. You need a well-known config file with
>>>> well-known config options in it.
>>>>
>>>> All clients that were previously just trying to read new bytes from the
>>>> IMA log cannot do this anymore in the presence of a log shard producer
>>>> but have to also learn that a new log shard has been produced so they
>>>> need to figure out the new position in the log where to read from. So
>>>> maybe a counter in a config file should indicate to the log readers that
>>>> a new log has been produced -- otherwise they would have to monitor all
>>>> the log shard files or the log shard file's size.
>>> If a counter is needed, I would suggest placing it somewhere other
>>> than the config file so that we can enforce limited write access to
>>> the config file.
>>>
>>> Regardless, I imagine there are a few ways one could synchronize
>>> various userspace applications such that they see a consistent view of
>>> the decomposed log state, and the good news is that the approach
>>> described here is opt-in from a userspace perspective.  If the
>> A FUSE filesystem that stitches together the log shards from one or
>> multiple files + IMA log file(s) could make this approach transparent
>> for as long as log shards are not thrown away. Presumably it (or root)
>> could bind-mount its files over the two IMA log files.
>>
>>> userspace does not fully support IMA log snapshotting then it never
>>> needs to trigger it and the system behaves as it does today; on the
>> I don't think individual applications should trigger it , instead some
>> dedicated background process running on a machine would do that every n
>> log entries or so and possibly offer the FUSE filesystem at the same
>> time. In either case, once any application triggers it, all either have
>> to know how to deal with the shards or FUSE would make it completely
>> transparent.
FUSE would be a reasonable user-space coordination mechanism.  A
privileged process would trigger the snapshot generation and provide a
mountpoint from which relying parties can read the full IMA log, backed
by the shards as needed.

Whether it is a privileged daemon or some other agent that triggers the
snapshot, it shouldn't impact the Kernel-side implementation.

- Sush
> Yes, performing a snapshot is a privileged operation which I expect
> would be done and managed by a dedicated daemon running on the system.
>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-16 22:07       ` Paul Moore
  2023-11-16 22:41         ` Stefan Berger
@ 2023-11-20 20:03         ` Tushar Sugandhi
  1 sibling, 0 replies; 30+ messages in thread
From: Tushar Sugandhi @ 2023-11-20 20:03 UTC (permalink / raw)
  To: Paul Moore, Stefan Berger
  Cc: Sush Shringarputale, linux-integrity, Mimi Zohar, peterhuewe,
	Jarkko Sakkinen, jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian



On 11/16/23 14:07, Paul Moore wrote:
> On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger <stefanb@linux.ibm.com> wrote:
>> On 11/14/23 13:36, Sush Shringarputale wrote:
>>> On 11/13/2023 10:59 AM, Stefan Berger wrote:
>>>> On 10/19/23 14:49, Tushar Sugandhi wrote:
>>>>> =======================================================================
>>>>> | Introduction |
>>>>> =======================================================================
>>>>> This document provides a detailed overview of the proposed Kernel
>>>>> feature IMA log snapshotting.  It describes the motivation behind the
>>>>> proposal, the problem to be solved, a detailed solution design with
>>>>> examples, and describes the changes to be made in the clients/services
>>>>> which are part of remote-attestation system.  This is the 2nd version
>>>>> of the proposal.  The first version is present here[1].
>>>>>
>>>>> Table of Contents:
>>>>> ------------------
>>>>> A. Motivation and Background
>>>>> B. Goals and Non-Goals
>>>>>       B.1 Goals
>>>>>       B.2 Non-Goals
>>>>> C. Proposed Solution
>>>>>       C.1 Solution Summary
>>>>>       C.2 High-level Work-flow
>>>>> D. Detailed Design
>>>>>       D.1 Snapshot Aggregate Event
>>>>>       D.2 Snapshot Triggering Mechanism
>>>>>       D.3 Choosing A Persistent Storage Location For Snapshots
>>>>>       D.4 Remote-Attestation Client/Service-side Changes
>>>>>           D.4.a Client-side Changes
>>>>>           D.4.b Service-side Changes
>>>>> E. Example Walk-through
>>>>> F. Other Design Considerations
>>>>> G. References
>>>>>
>>>>
>>>> Userspace applications will have to know
>>>> a) where are the shard files?
>>> We describe the file storage location choices in section D.3, but user
>>> applications will have to query the well-known location described there.
>>>> b) how do I read the shard files while locking out the producer of the
>>>> shard files?
>>>>
>>>> IMO, this will require a well known config file and a locking method
>>>> (flock) so that user space applications can work together in this new
>>>> environment. The lock could be defined in the config file or just be
>>>> the config file itself.
>>> The flock is a good idea for co-ordination between UM clients. While
>>> the Kernel cannot enforce any access in this way, any UM process that
>>> is planning on triggering the snapshot mechanism should follow that
>>> protocol.  We will ensure we document that as the best-practices in
>>> the patch series.
>>
>> It's more than 'best practices'. You need a well-known config file with
>> well-known config options in it.
>>
>> All clients that were previously just trying to read new bytes from the
>> IMA log cannot do this anymore in the presence of a log shard producer
>> but have to also learn that a new log shard has been produced so they
>> need to figure out the new position in the log where to read from. So
>> maybe a counter in a config file should indicate to the log readers that
>> a new log has been produced -- otherwise they would have to monitor all
>> the log shard files or the log shard file's size.
> 
> If a counter is needed, I would suggest placing it somewhere other
> than the config file so that we can enforce limited write access to
> the config file.
> 
Agreed. The counter shouldn't be part of a config file.

The IMA log already provides a trustworthy, tamper-resilient mechanism
for storing such data.

The current design already provides the mechanism to store
the counter as part of the snapshot_aggregate event.

See section "D.1 Snapshot Aggregate Event" in the proposal for
reference.

Snapshot_Counter   := "Snapshot_Attempt_Count="
                               <num. snapshot attempts>


"snapshot_aggregate" becomes the first event recorded in the
in-memory IMA log, after the past entries are purged to
a shard file.  Along with the other benefits, the "snapshot_aggregate"
event also provides info to UM clients about how many snapshots are
taken so far.


See section "C.2 High-level Work-flow" in the proposal for more
info.

           Step #f
           ---------
      (In-memory IMA log)
    .----------------------.
    | "snapshot_aggregate" |
    | Event #E4            |
    | Event #E5            |
    '----------------------'
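
For illustration only, a UM client could recover the Snapshot_Attempt_Count
value by scanning the ASCII measurement list for the field defined above.
This sketch assumes securityfs is mounted at /sys/kernel/security and that
the field appears in ASCII form in the template data; the exact
representation depends on the final event format.

/*
 * Minimal sketch: scan the ASCII IMA measurement list for the
 * "Snapshot_Attempt_Count=" field carried by the snapshot_aggregate
 * event.  Real template parsing is more involved; this only shows the
 * idea.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

long snapshot_attempt_count(void)
{
        static const char *path =
                "/sys/kernel/security/ima/ascii_runtime_measurements";
        static const char *key = "Snapshot_Attempt_Count=";
        FILE *f = fopen(path, "r");
        char line[4096];
        long count = -1;

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f)) {
                char *p = strstr(line, key);

                if (p)
                        count = strtol(p + strlen(key), NULL, 10);
        }
        fclose(f);
        return count;
}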

~Tushar
> Regardless, I imagine there are a few ways one could synchronize
> various userspace applications such that they see a consistent view of
> the decomposed log state, and the good news is that the approach
> described here is opt-in from a userspace perspective.  If the
> userspace does not fully support IMA log snapshotting then it never
> needs to trigger it and the system behaves as it does today; on the
> other hand, if the userspace has been updated it can make use of the
> new functionality to better manage the size of the IMA measurement
> log.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-16 22:28   ` Paul Moore
@ 2023-11-22  1:01     ` Tushar Sugandhi
  2023-11-22  1:18       ` Mimi Zohar
  2023-11-22  4:27     ` Paul Moore
  1 sibling, 1 reply; 30+ messages in thread
From: Tushar Sugandhi @ 2023-11-22  1:01 UTC (permalink / raw)
  To: Paul Moore, Mimi Zohar
  Cc: linux-integrity, peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman,
	bhe, vgoyal, Dave Young, kexec@lists.infradead.org, jmorris,
	serge, James Bottomley, linux-security-module, Tyler Hicks,
	Lakshmi Ramasubramanian, Sush Shringarputale



On 11/16/23 14:28, Paul Moore wrote:
> On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
>> On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote:
>>
>> [...]
>>> -----------------------------------------------------------------------
>>> | C.1 Solution Summary                                                |
>>> -----------------------------------------------------------------------
>>> To achieve the goals described in the section above, we propose the
>>> following changes to the IMA subsystem.
>>>
>>>       a. The IMA log from Kernel memory will be offloaded to some
>>>          persistent storage disk to keep the system running reliably
>>>          without facing memory pressure.
>>>          More details, alternate approaches considered etc. are present
>>>          in section "D.3 Choices for Storing Snapshots" below.
>>>
>>>       b. The IMA log will be divided into multiple chunks (snapshots).
>>>          Each snapshot would be a delta between the two instances when
>>>          the log was offloaded from memory to the persistent storage
>>>          disk.
>>>
>>>       c. Some UM process (like a remote-attestation-client) will be
>>>          responsible for writing the IMA log snapshot to the disk.
>>>
>>>       d. The same UM process would be responsible for triggering the IMA
>>>          log snapshot.
>>>
>>>       e. There will be a well-known location for storing the IMA log
>>>          snapshots on the disk.  It will be non-trivial for UM processes
>>>          to change that location after booting into the Kernel.
>>>
>>>       f. A new event, "snapshot_aggregate", will be computed and measured
>>>          in the IMA log as part of this feature.  It should help the
>>>          remote-attestation client/service to benefit from the IMA log
>>>          snapshot feature.
>>>          The "snapshot_aggregate" event is described in more details in
>>>          section "D.1 Snapshot Aggregate Event" below.
>>>
>>>       g. If the existing remote-attestation client/services do not change
>>>          to benefit from this feature or do not trigger the snapshot,
>>>          the Kernel will continue to have its current functionality of
>>>          maintaining an in-memory full IMA log.
>>>
>>> Additionally, the remote-attestation client/services need to be updated
>>> to benefit from the IMA log snapshot feature.  These proposed changes
>>>
>>> are described in section "D.4 Remote-Attestation Client/Service Side
>>> Changes" below, but their implementation is out of scope for this
>>> proposal.
>>
>> As previously said on v1,
>>     This design seems overly complex and requires synchronization between the
>>     "snapshot" record and exporting the records from the measurement list. [...]
>>
>>     Concerns:
>>     - Pausing extending the measurement list.
>>
>> Nothing has changed in terms of the complexity or in terms of pausing
>> the measurement list.   Pausing the measurement list is a non starter.
> 
> The measurement list would only need to be paused for the amount of
> time it would require to generate the snapshot_aggregate entry, which
> should be minimal and only occurs when a privileged userspace requests
> a snapshot operation.  The snapshot remains opt-in functionality, and
> even then there is the possibility that the kernel could reject the
> snapshot request if generating the snapshot_aggregate entry was deemed
> too costly (as determined by the kernel) at that point in time.
> 
Thanks Paul for responding and sharing your thoughts.


Hi Mimi,
To address your concern about pausing the measurements: we are not
proposing to pause the measurements for the entire duration of the
UM <--> Kernel interaction while taking a snapshot.

We are simply proposing to pause the measurements for the short window in
which the TPM PCR quotes are read and added to the "snapshot_aggregate"
event.  IMA already serializes concurrent additions to the IMA log by
taking mutex_lock(&ima_extend_list_mutex) in ima_add_template_entry.


We plan to use this existing locking functionality.
Hope this addresses your concern about pausing extending the measurement
list.
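
To make the intent concrete, here is a rough kernel-side sketch of that
locking pattern.  It is not patch code; the two helper functions are
hypothetical names, and only ima_extend_list_mutex and
ima_add_template_entry exist today.

/*
 * Rough sketch of the locking pattern described above, not patch code.
 * ima_extend_list_mutex already exists in security/integrity/ima; the
 * two helpers below are hypothetical and only illustrate how small the
 * critical section is intended to be (they must not re-take the mutex).
 */
#include <linux/mutex.h>

static DEFINE_MUTEX(ima_extend_list_mutex);     /* as in ima_queue.c */

int ima_read_pcrs_for_aggregate(void);          /* hypothetical */
int ima_store_snapshot_aggregate(void);         /* hypothetical */

static int ima_generate_snapshot_aggregate(void)
{
        int ret;

        mutex_lock(&ima_extend_list_mutex);     /* pause new measurements */
        ret = ima_read_pcrs_for_aggregate();    /* read the TPM PCR quotes */
        if (!ret)
                ret = ima_store_snapshot_aggregate();   /* append the event */
        mutex_unlock(&ima_extend_list_mutex);   /* measurements resume */

        return ret;
}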

~Tushar

>> Userspace can already export the IMA measurement list(s) via the
>> securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
>> it wants with it.  All that is missing in the kernel is the ability to
>> trim the measurement list, which doesn't seem all that complicated.
> 
> From my perspective what has been presented is basically just trimming
> the in-memory measurement log, the additional complexity (which really
> doesn't look that bad IMO) is there to ensure robustness in the face
> of an unreliable userspace (processes die, get killed, etc.) and to
> establish a new, transitive root of trust in the newly trimmed
> in-memory log.
> 
> I suppose one could simplify things greatly by having a design where
> userspace  captures the measurement log and then writes the number of
> measurement records to trim from the start of the measurement log to a
> sysfs file and the kernel acts on that.  You could do this with, or
> without, the snapshot_aggregate entry concept; in fact that could be
> something that was controlled by userspace, e.g. write the number of
> lines and a flag to indicate if a snapshot_aggregate was desired to
> the sysfs file.  I can't say I've thought it all the way through to
> make sure there are no gotchas, but I'm guessing that is about as
> simple as one can get.
> 
> If there is something else you had in mind, Mimi, please share the
> details.  This is a very real problem we are facing and we want to
> work to get a solution upstream.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-22  1:01     ` Tushar Sugandhi
@ 2023-11-22  1:18       ` Mimi Zohar
  0 siblings, 0 replies; 30+ messages in thread
From: Mimi Zohar @ 2023-11-22  1:18 UTC (permalink / raw)
  To: Tushar Sugandhi, Paul Moore
  Cc: linux-integrity, peterhuewe, Jarkko Sakkinen, jgg, Ken Goldman,
	bhe, vgoyal, Dave Young, kexec@lists.infradead.org, jmorris,
	serge, James Bottomley, linux-security-module, Tyler Hicks,
	Lakshmi Ramasubramanian, Sush Shringarputale

On Tue, 2023-11-21 at 17:01 -0800, Tushar Sugandhi wrote:
> Hi Mimi,
> To address your concern about pausing the measurements -
> We are not proposing to pause the measurements for the entire duration
> of UM <--> Kernel interaction while taking a snapshot.
> 
> We are simply proposing to pause the measurements when we get the TPM
> PCR quotes to add them to "snapshot_aggregate". (which should be a very
> small time window). IMA already has this mechanism when two separate
> modules try to add entry to IMA log - by using
> mutex_lock(&ima_extend_list_mutex); in ima_add_template_entry.
> 
> 
> We plan to use this existing locking functionality.
> Hope this addresses your concern about pausing extending the measurement
> list.

Each TPM PCR read is a separate TPM command.  Have you done any
performance analysis to see how long it actually takes to calculate the
"snapshot_aggregate" with a physical TPM?

The "snapshot_aggregate" is a new type of critical data and should be
upstreamed independently of this patch set.

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-16 22:28   ` Paul Moore
  2023-11-22  1:01     ` Tushar Sugandhi
@ 2023-11-22  4:27     ` Paul Moore
  2023-11-22 13:18       ` Mimi Zohar
  1 sibling, 1 reply; 30+ messages in thread
From: Paul Moore @ 2023-11-22  4:27 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Thu, Nov 16, 2023 at 5:28 PM Paul Moore <paul@paul-moore.com> wrote:
> On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar <zohar@linux.ibm.com> wrote:

...

> > Userspace can already export the IMA measurement list(s) via the
> > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > it wants with it.  All that is missing in the kernel is the ability to
> > trim the measurement list, which doesn't seem all that complicated.
>
> From my perspective what has been presented is basically just trimming
> the in-memory measurement log, the additional complexity (which really
> doesn't look that bad IMO) is there to ensure robustness in the face
> of an unreliable userspace (processes die, get killed, etc.) and to
> establish a new, transitive root of trust in the newly trimmed
> in-memory log.
>
> I suppose one could simplify things greatly by having a design where
> userspace  captures the measurement log and then writes the number of
> measurement records to trim from the start of the measurement log to a
> sysfs file and the kernel acts on that.  You could do this with, or
> without, the snapshot_aggregate entry concept; in fact that could be
> something that was controlled by userspace, e.g. write the number of
> lines and a flag to indicate if a snapshot_aggregate was desired to
> the sysfs file.  I can't say I've thought it all the way through to
> make sure there are no gotchas, but I'm guessing that is about as
> simple as one can get.
>
> If there is something else you had in mind, Mimi, please share the
> details.  This is a very real problem we are facing and we want to
> work to get a solution upstream.

Any thoughts on this Mimi?  We have a real interest in working with
you to solve this problem upstream, but we need more detailed feedback
than "too complicated".  If you don't like the solutions presented
thus far, what type of solution would you like to see?

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-22  4:27     ` Paul Moore
@ 2023-11-22 13:18       ` Mimi Zohar
  2023-11-22 14:22         ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2023-11-22 13:18 UTC (permalink / raw)
  To: Paul Moore
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote:
> On Thu, Nov 16, 2023 at 5:28 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> 
> ...
> 
> > > Userspace can already export the IMA measurement list(s) via the
> > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > > it wants with it.  All that is missing in the kernel is the ability to
> > > trim the measurement list, which doesn't seem all that complicated.
> >
> > From my perspective what has been presented is basically just trimming
> > the in-memory measurement log, the additional complexity (which really
> > doesn't look that bad IMO) is there to ensure robustness in the face
> > of an unreliable userspace (processes die, get killed, etc.) and to
> > establish a new, transitive root of trust in the newly trimmed
> > in-memory log.
> >
> > I suppose one could simplify things greatly by having a design where
> > userspace  captures the measurement log and then writes the number of
> > measurement records to trim from the start of the measurement log to a
> > sysfs file and the kernel acts on that.  You could do this with, or
> > without, the snapshot_aggregate entry concept; in fact that could be
> > something that was controlled by userspace, e.g. write the number of
> > lines and a flag to indicate if a snapshot_aggregate was desired to
> > the sysfs file.  I can't say I've thought it all the way through to
> > make sure there are no gotchas, but I'm guessing that is about as
> > simple as one can get.

> > If there is something else you had in mind, Mimi, please share the
> > details.  This is a very real problem we are facing and we want to
> > work to get a solution upstream.
> 
> Any thoughts on this Mimi?  We have a real interest in working with
> you to solve this problem upstream, but we need more detailed feedback
> than "too complicated".  If you don't like the solutions presented
> thus far, what type of solution would you like to see?

Paul, the design copies the measurement list to a temporary "snapshot"
file, before trimming the measurement list, which according to the
design document locks the existing measurement list.  And further
pauses extending the measurement list to calculate the
"snapshot_aggregate".

Userspace can export the measurement list already, so why this
complicated design?

As I mentioned previously and repeated yesterday, the
"snapshot_aggregate" is a new type of critical data and should be
upstreamed independently of this patch set that trims the measurement
list.  Trimming the measurement list could be based, as you suggested
on the number of records to remove, or it could be up to the next/last
"snapshot_aggregate" record.

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-22 13:18       ` Mimi Zohar
@ 2023-11-22 14:22         ` Paul Moore
  2023-11-27 17:07           ` Mimi Zohar
  2023-12-20 22:13           ` Ken Goldman
  0 siblings, 2 replies; 30+ messages in thread
From: Paul Moore @ 2023-11-22 14:22 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Wed, Nov 22, 2023 at 8:18 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote:
> > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore <paul@paul-moore.com> wrote:
> > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> >
> > ...
> >
> > > > Userspace can already export the IMA measurement list(s) via the
> > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > > > it wants with it.  All that is missing in the kernel is the ability to
> > > > trim the measurement list, which doesn't seem all that complicated.
> > >
> > > From my perspective what has been presented is basically just trimming
> > > the in-memory measurement log, the additional complexity (which really
> > > doesn't look that bad IMO) is there to ensure robustness in the face
> > > of an unreliable userspace (processes die, get killed, etc.) and to
> > > establish a new, transitive root of trust in the newly trimmed
> > > in-memory log.
> > >
> > > I suppose one could simplify things greatly by having a design where
> > > userspace  captures the measurement log and then writes the number of
> > > measurement records to trim from the start of the measurement log to a
> > > sysfs file and the kernel acts on that.  You could do this with, or
> > > without, the snapshot_aggregate entry concept; in fact that could be
> > > something that was controlled by userspace, e.g. write the number of
> > > lines and a flag to indicate if a snapshot_aggregate was desired to
> > > the sysfs file.  I can't say I've thought it all the way through to
> > > make sure there are no gotchas, but I'm guessing that is about as
> > > simple as one can get.
>
> > > If there is something else you had in mind, Mimi, please share the
> > > details.  This is a very real problem we are facing and we want to
> > > work to get a solution upstream.
> >
> > Any thoughts on this Mimi?  We have a real interest in working with
> > you to solve this problem upstream, but we need more detailed feedback
> > than "too complicated".  If you don't like the solutions presented
> > thus far, what type of solution would you like to see?
>
> Paul, the design copies the measurement list to a temporary "snapshot"
> file, before trimming the measurement list, which according to the
> design document locks the existing measurement list.  And further
> pauses extending the measurement list to calculate the
> "snapshot_aggregate".

I believe the intent is to only pause the measurements while the
snapshot_aggregate is generated, not for the duration of the entire
snapshot process.  The purpose of the snapshot_aggregate is to
establish a new root of trust, similar to the boot_aggregate, to help
improve attestation performance.

> Userspace can export the measurement list already, so why this
> complicated design?

The current code has no provision for trimming the measurement log;
that's the primary reason.

> As I mentioned previously and repeated yesterday, the
> "snapshot_aggregate" is a new type of critical data and should be
> upstreamed independently of this patch set that trims the measurement
> list.  Trimming the measurement list could be based, as you suggested
> on the number of records to remove, or it could be up to the next/last
> "snapshot_aggregate" record.

Okay, we are starting to get closer, but I'm still missing the part
where you say "if you do X, Y, and Z, I'll accept and merge the
solution."  Can you be more explicit about what approach(es) you would
be willing to accept upstream?

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-22 14:22         ` Paul Moore
@ 2023-11-27 17:07           ` Mimi Zohar
  2023-11-27 22:16             ` Paul Moore
  2023-12-20 22:13           ` Ken Goldman
  1 sibling, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2023-11-27 17:07 UTC (permalink / raw)
  To: Paul Moore
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> On Wed, Nov 22, 2023 at 8:18 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote:
> > > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore <paul@paul-moore.com> wrote:
> > > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > >
> > > ...
> > >
> > > > > Userspace can already export the IMA measurement list(s) via the
> > > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > > > > it wants with it.  All that is missing in the kernel is the ability to
> > > > > trim the measurement list, which doesn't seem all that complicated.
> > > >
> > > > From my perspective what has been presented is basically just trimming
> > > > the in-memory measurement log, the additional complexity (which really
> > > > doesn't look that bad IMO) is there to ensure robustness in the face
> > > > of an unreliable userspace (processes die, get killed, etc.) and to
> > > > establish a new, transitive root of trust in the newly trimmed
> > > > in-memory log.
> > > >
> > > > I suppose one could simplify things greatly by having a design where
> > > > userspace  captures the measurement log and then writes the number of
> > > > measurement records to trim from the start of the measurement log to a
> > > > sysfs file and the kernel acts on that.  You could do this with, or
> > > > without, the snapshot_aggregate entry concept; in fact that could be
> > > > something that was controlled by userspace, e.g. write the number of
> > > > lines and a flag to indicate if a snapshot_aggregate was desired to
> > > > the sysfs file.  I can't say I've thought it all the way through to
> > > > make sure there are no gotchas, but I'm guessing that is about as
> > > > simple as one can get.
> >
> > > > If there is something else you had in mind, Mimi, please share the
> > > > details.  This is a very real problem we are facing and we want to
> > > > work to get a solution upstream.
> > >
> > > Any thoughts on this Mimi?  We have a real interest in working with
> > > you to solve this problem upstream, but we need more detailed feedback
> > > than "too complicated".  If you don't like the solutions presented
> > > thus far, what type of solution would you like to see?
> >
> > Paul, the design copies the measurement list to a temporary "snapshot"
> > file, before trimming the measurement list, which according to the
> > design document locks the existing measurement list.  And further
> > pauses extending the measurement list to calculate the
> > "snapshot_aggregate".
> 
> I believe the intent is to only pause the measurements while the
> snapshot_aggregate is generated, not for the duration of the entire
> snapshot process.  The purpose of the snapshot_aggregate is to
> establish a new root of trust, similar to the boot_aggregate, to help
> improve attestation performance.
> 
> > Userspace can export the measurement list already, so why this
> > complicated design?
> 
> The current code has no provision for trimming the measurement log,
> that's the primary reason.
> 
> > As I mentioned previously and repeated yesterday, the
> > "snapshot_aggregate" is a new type of critical data and should be
> > upstreamed independently of this patch set that trims the measurement
> > list.  Trimming the measurement list could be based, as you suggested
> > on the number of records to remove, or it could be up to the next/last
> > "snapshot_aggregate" record.
> 
> Okay, we are starting to get closer, but I'm still missing the part
> where you say "if you do X, Y, and Z, I'll accept and merge the
> solution."  Can you be more explicit about what approach(es) you would
> be willing to accept upstream?

Included with what is wanted/needed is an explanation as to my concerns
with the existing proposal.

First we need to differentiate between kernel and userspace
requirements.  (The "snapshotting" design proposal intermixes them.)

From the kernel perspective, the Log Snapshotting Design proposal "B.1
Goals" is very nice, but once the measurement list can be trimmed it is
really irrelevant.  Userspace can do whatever it wants with the
measurement list records.  So instead of paying lip service to what
should be done, just call it as it is - trimming the measurement list.

-----------------------------------------------------------------------
| B.1 Goals                                                           |
-----------------------------------------------------------------------
To address the issues described in the section above, we propose
enhancements to the IMA subsystem to achieve the following goals:

  a. Reduce memory pressure on the Kernel caused by larger in-memory
     IMA logs.

  b. Preserve the system's ability to get remotely attested using the
     IMA log, even after implementing the enhancements to reduce memory
     pressure caused by the IMA log. IMA's Integrity guarantees should
     be maintained.

  c. Provide mechanisms from Kernel side to the remote attestation
     service to make service-side processing more efficient.

From the kernel perspective there needs to be a method of trimming N
number of records from the head of the measurement list.  In addition
to the existing securityfs "runtime measurement list",  defining a new
securityfs file containing the current count of in memory measurement
records would be beneficial.  Defining other IMA securityfs files like
how many times the measurement list has been trimmed might be
beneficial as well.  Of course properly document the integrity
implications and repercussions of the new Kconfig that allows trimming
the measurement list.
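
For illustration only, a minimal sketch of what such an interface
might look like.  The "trim_records" file name and the
ima_trim_measurements() helper below are assumptions, not existing
code; only securityfs_create_file(), copy_from_user() and kstrtoul()
are existing kernel APIs.

    #include <linux/errno.h>
    #include <linux/fs.h>
    #include <linux/security.h>
    #include <linux/uaccess.h>
    #include <linux/kstrtox.h>

    /* assumed new helper that drops 'nr' records from the list head */
    void ima_trim_measurements(unsigned long nr);

    static ssize_t ima_trim_write(struct file *file, const char __user *ubuf,
                                  size_t count, loff_t *ppos)
    {
        char buf[32];
        unsigned long nr;

        if (count == 0 || count >= sizeof(buf))
            return -EINVAL;
        if (copy_from_user(buf, ubuf, count))
            return -EFAULT;
        buf[count] = '\0';
        if (kstrtoul(buf, 10, &nr))
            return -EINVAL;

        ima_trim_measurements(nr);
        return count;
    }

    static const struct file_operations ima_trim_ops = {
        .write = ima_trim_write,
    };

    /*
     * Registered under <securityfs>/ima/, e.g.:
     * securityfs_create_file("trim_records", 0200, ima_dir, NULL,
     *                        &ima_trim_ops);
     */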

Defining a simple "trim" marker measurement record would be a visual
indication that the measurement list has been trimmed.  I might even
have compared it to the "boot_aggregate".  However, the proposed marker
based on TPM PCRs requires pausing extending the measurement list.  
Although the TCG TPM spec allows reading multiple PCRs in a single
command, the read may fail due to the output buffer size, so reading
one TPM PCR value at a time is safer.  The more TPM banks and PCRs
needed, the longer it will take.  Remember this critical-data record
won't be limited to just software TPMs, but could be used with
physical ones as well.  For a physical TPM, this could be on the order
of 240 ms per TPM bank (24 PCRs).

Before defining a new critical-data record, we need to decide whether
it is really necessary or if it is redundant.  If we define a new
"critical-data" record, can it be defined such that it doesn't require
pausing extending the measurement list?  For example, a new simple
visual critical-data record could contain the number of records (e.g.
<securityfs>/ima/runtime_measurements_count) up to that point.
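
As a rough sketch only (the event label/name and the caller supplying
the count below are made up), such a record could go through the
existing ima_measure_critical_data() interface without touching the
TPM read path at all:

    #include <linux/ima.h>
    #include <linux/kernel.h>

    /*
     * Sketch: record the current number of measurement records as an
     * IMA critical-data event.  No PCR reads are involved, so nothing
     * needs to be paused while the record is built.
     */
    static void ima_emit_count_record(unsigned long count)
    {
        char buf[32];
        int len = scnprintf(buf, sizeof(buf), "%lu", count);

        ima_measure_critical_data("ima", "measurement_count",
                                  buf, len, false, NULL, 0);
    }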

The new critical-data record and trimming the measurement list should
be disjoint features.  If the first record after trimming the
measurement list should be the critical-data record, then trim the
measurement list up to that point.

From a userspace perspective, trimming the measurement list is a major
change and will break existing attestation requests, unless the change
is transparent.  Removing "snapshots"/"shards" will of course break
attestation requests.  Refer to Stefan's suggestions: 
https://lore.kernel.org/linux-integrity/1ed2d72c-4cb2-48b3-bb0f-b0877fc1e9ca@linux.ibm.com/

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-27 17:07           ` Mimi Zohar
@ 2023-11-27 22:16             ` Paul Moore
  2023-11-28 12:09               ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Paul Moore @ 2023-11-27 22:16 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:

...

> > Okay, we are starting to get closer, but I'm still missing the part
> > where you say "if you do X, Y, and Z, I'll accept and merge the
> > solution."  Can you be more explicit about what approach(es) you would
> > be willing to accept upstream?
>
> Included with what is wanted/needed is an explanation as to my concerns
> with the existing proposal.
>
> First we need to differentiate between kernel and userspace
> requirements.  (The "snapshotting" design proposal intermixes them.)
>
> From the kernel perspective, the Log Snapshotting Design proposal "B.1
> Goals" is very nice, but once the measurement list can be trimmed it is
> really irrelevant.  Userspace can do whatever it wants with the
> measurement list records.  So instead of paying lip service to what
> should be done, just call it as it is - trimming the measurement list.

Fair enough.  I personally think it is nice to have a brief discussion
of how userspace might use a kernel feature, but if you prefer to drop
that part of the design doc I doubt anyone will object very strongly.

> -----------------------------------------------------------------------
> | B.1 Goals                                                           |
> -----------------------------------------------------------------------
> To address the issues described in the section above, we propose
> enhancements to the IMA subsystem to achieve the following goals:
>
>   a. Reduce memory pressure on the Kernel caused by larger in-memory
>      IMA logs.
>
>   b. Preserve the system's ability to get remotely attested using the
>      IMA log, even after implementing the enhancements to reduce memory
>      pressure caused by the IMA log. IMA's Integrity guarantees should
>      be maintained.
>
>   c. Provide mechanisms from Kernel side to the remote attestation
>      service to make service-side processing more efficient.

That looks fine to me.

> From the kernel perspective there needs to be a method of trimming N
> number of records from the head of the measurement list.  In addition
> to the existing securityfs "runtime measurement list",  defining a new
> securityfs file containing the current count of in memory measurement
> records would be beneficial.

I imagine that should be trivial to implement and I can't imagine
there being any objection to that.

If we are going to have a record count, I imagine it would also be
helpful to maintain a securityfs file with the total size (in bytes)
of the in-memory measurement log.  In fact, I suspect this will
probably be more useful for those who wish to manage the size of the
measurement log.

> Defining other IMA securityfs files like
> how many times the measurement list has been trimmed might be
> beneficial as well.

I have no objection to that.  Would a total record count, i.e. a value
that doesn't reset on a snapshot event, be more useful here?

> Of course properly document the integrity
> implications and repercussions of the new Kconfig that allows trimming
> the measurement list.

Of course.

> Defining a simple "trim" marker measurement record would be a visual
> indication that the measurement list has been trimmed.  I might even
> have compared it to the "boot_aggregate".  However, the proposed marker
> based on TPM PCRs requires pausing extending the measurement list.

...

> Before defining a new critical-data record, we need to decide whether
> it is really necessary or if it is redundant.  If we define a new
> "critical-data" record, can it be defined such that it doesn't require
> pausing extending the measurement list?  For example, a new simple
> visual critical-data record could contain the number of records (e.g.
> <securityfs>/ima/runtime_measurements_count) up to that point.

What if the snapshot_aggregate was a hash of the measurement log
starting with either the boot_aggregate or the latest
snapshot_aggregate and ending on the record before the new
snapshot_aggregate?  The performance impact at snapshot time should be
minimal as the hash can be incrementally updated as new records are
added to the measurement list.  While the hash wouldn't capture the
TPM state, it would allow some crude verification when reassembling
the log.  If one could bear the cost of a TPM signing operation, the
log digest could be signed by the TPM.
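
Roughly along these lines, as a sketch only (assuming a SHA-256 shash
kept alive between records; the function names and the point where the
update hook is called are made up):

    #include <crypto/hash.h>
    #include <crypto/sha2.h>
    #include <linux/err.h>
    #include <linux/slab.h>

    static struct shash_desc *snapshot_desc;

    static int snapshot_hash_start(void)
    {
        struct crypto_shash *tfm = crypto_alloc_shash("sha256", 0, 0);

        if (IS_ERR(tfm))
            return PTR_ERR(tfm);

        snapshot_desc = kmalloc(sizeof(*snapshot_desc) +
                                crypto_shash_descsize(tfm), GFP_KERNEL);
        if (!snapshot_desc) {
            crypto_free_shash(tfm);
            return -ENOMEM;
        }
        snapshot_desc->tfm = tfm;
        return crypto_shash_init(snapshot_desc);
    }

    /* called as each record is appended to the measurement list */
    static int snapshot_hash_update(const void *record, size_t len)
    {
        return crypto_shash_update(snapshot_desc, record, len);
    }

    /* called at snapshot time; restarts the running hash afterwards */
    static int snapshot_hash_final(u8 digest[SHA256_DIGEST_SIZE])
    {
        int ret = crypto_shash_final(snapshot_desc, digest);

        return ret ? ret : crypto_shash_init(snapshot_desc);
    }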

> The new critical-data record and trimming the measurement list should
> be disjoint features.  If the first record after trimming the
> measurement list should be the critical-data record, then trim the
> measurement list up to that point.

I disagree about the snapshot_aggregate record being disjoint from the
measurement log, but I suspect Tushar and Sush are willing to forgo
the snapshot_aggregate if that is a blocker from your perspective.
Once again, the main goal is the ability to manage the size of the
measurement log; while having a snapshot_aggregate that can be used to
establish a root of trust similar to the boot_aggregate is nice, it is
not a MUST have.

> From a userspace perspective, trimming the measurement list is a major
> change and will break existing attestation requests, unless the change
> is transparent.  Removing "snapshots"/"shards" will of course break
> attestation requests.  Refer to Stefan's suggestions:
> https://lore.kernel.org/linux-integrity/1ed2d72c-4cb2-48b3-bb0f-b0877fc1e9ca@linux.ibm.com/

You will note that Sush and I replied to Stefan two weeks ago.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-27 22:16             ` Paul Moore
@ 2023-11-28 12:09               ` Mimi Zohar
  2023-11-29  1:06                 ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2023-11-28 12:09 UTC (permalink / raw)
  To: Paul Moore
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> 
> ...
> 
> > > Okay, we are starting to get closer, but I'm still missing the part
> > > where you say "if you do X, Y, and Z, I'll accept and merge the
> > > solution."  Can you be more explicit about what approach(es) you would
> > > be willing to accept upstream?
> >
> > Included with what is wanted/needed is an explanation as to my concerns
> > with the existing proposal.
> >
> > First we need to differentiate between kernel and userspace
> > requirements.  (The "snapshotting" design proposal intermixes them.)
> >
> > From the kernel perspective, the Log Snapshotting Design proposal "B.1
> > Goals" is very nice, but once the measurement list can be trimmed it is
> > really irrelevant.  Userspace can do whatever it wants with the
> > measurement list records.  So instead of paying lip service to what
> > should be done, just call it as it is - trimming the measurement list.
> 
> Fair enough.  I personally think it is nice to have a brief discussion
> of how userspace might use a kernel feature, but if you prefer to drop
> that part of the design doc I doubt anyone will object very strongly.
> 
> > From the kernel perspective there needs to be a method of trimming N
> > number of records from the head of the measurement list.  In addition
> > to the existing securityfs "runtime measurement list",  defining a new
> > securityfs file containing the current count of in memory measurement
> > records would be beneficial.
> 
> I imagine that should be trivial to implement and I can't imagine
> there being any objection to that.
> 
> If we are going to have a record count, I imagine it would also be
> helpful to maintain a securityfs file with the total size (in bytes)
> of the in-memory measurement log.  In fact, I suspect this will
> probably be more useful for those who wish to manage the size of the
> measurement log.

A running number of bytes needed for carrying the measurement list
across kexec already exists.  This value would be affected when the
measurement list is trimmed.

...

> 
> > Defining other IMA securityfs files like
> > how many times the measurement list has been trimmed might be
> > beneficial as well.
> 
> I have no objection to that.  Would a total record count, i.e. a value
> that doesn't reset on a snapshot event, be more useful here?

<securityfs>/ima/runtime_measurements_count already exports the total
number of measurement records.

> 
> > Of course properly document the integrity
> > implications and repercussions of the new Kconfig that allows trimming
> > the measurement list.
> 
> Of course.
> 
> > Defining a simple "trim" marker measurement record would be a visual
> > indication that the measurement list has been trimmed.  I might even
> > have compared it to the "boot_aggregate".  However, the proposed marker
> > based on TPM PCRs requires pausing extending the measurement list.
> 
> ...
> 
> > Before defining a new critical-data record, we need to decide whether
> > it is really necessary or if it is redundant.  If we define a new
> > "critical-data" record, can it be defined such that it doesn't require
> > pausing extending the measurement list?  For example, a new simple
> > visual critical-data record could contain the number of records (e.g.
> > <securityfs>/ima/runtime_measurements_count) up to that point.
> 
> What if the snapshot_aggregate was a hash of the measurement log
> starting with either the boot_aggregate or the latest
> snapshot_aggregate and ending on the record before the new
> snapshot_aggregate?  The performance impact at snapshot time should be
> minimal as the hash can be incrementally updated as new records are
> added to the measurement list.  While the hash wouldn't capture the
> TPM state, it would allow some crude verification when reassembling
> the log.  If one could bear the cost of a TPM signing operation, the
> log digest could be signed by the TPM.

Other critical data is calculated, before calling
ima_measure_critical_data(), which adds the record to the measurement
list and extends the TPM PCR.

Signing the hash shouldn't be an issue if it behaves like other
critical data.

In addition to the hash, consider including other information in the
new critical data record (e.g. total number of measurement records, the
number of measurements included in the hash, the number of times the
measurement list was trimmed, etc). 
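
Purely as an illustration of carrying the hash plus those counters in
one record (the field names and the ASCII encoding below are not a
proposal, just an example):

    #include <linux/ima.h>
    #include <linux/kernel.h>
    #include <crypto/sha2.h>

    /* illustrative payload layout only; every field name is made up */
    static void ima_emit_trim_record(const u8 digest[SHA256_DIGEST_SIZE],
                                     unsigned long total_records,
                                     unsigned long hashed_records,
                                     unsigned int trim_count)
    {
        char payload[160];
        int len;

        len = scnprintf(payload, sizeof(payload),
                        "log_hash=%*phN;total=%lu;hashed=%lu;trims=%u",
                        SHA256_DIGEST_SIZE, digest,
                        total_records, hashed_records, trim_count);

        ima_measure_critical_data("ima", "snapshot_aggregate",
                                  payload, len, false, NULL, 0);
    }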

> 
> > The new critical-data record and trimming the measurement list should
> > be disjoint features.  If the first record after trimming the
> > measurement list should be the critical-data record, then trim the
> > measurement list up to that point.
> 
> I disagree about the snapshot_aggregate record being disjoint from the
> measurement log, but I suspect Tushar and Sush are willing to forgo
> the snapshot_aggregate if that is a blocker from your perspective.

> Once again, the main goal is the ability to manage the size of the
> measurement log; while having a snapshot_aggregate that can be used to
> establish a root of trust similar to the boot_aggregate is nice, it is
> not a MUST have.

The problem isn't the "snapshot_aggregate" critical data record per se,
but pausing the addition of measurements to the IMA measurement list
and the extension of the PCR in order to calculate it.

(Perhaps including other information, like the number of IMA
measurements taken before or after each TPM PCR read, would eliminate
the need for pausing the measurement list.)

> > From a userspace perspective, trimming the measurement list is a major
> > change and will break existing attestation requests, unless the change
> > is transparent.  Removing "snapshots"/"shards" will of course break
> > attestation requests.  Refer to Stefan's suggestions:
> > https://lore.kernel.org/linux-integrity/1ed2d72c-4cb2-48b3-bb0f-b0877fc1e9ca@linux.ibm.com/
> 
> You will note that Sush and I replied to Stefan two weeks ago.

Yes, I saw.  This might be a good place, as you suggested, "to have a
brief discussion of how userspace might use a kernel feature".  Perhaps
rename this thread to differentiate it from the kernel design.

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-28 12:09               ` Mimi Zohar
@ 2023-11-29  1:06                 ` Paul Moore
  2023-11-29  2:07                   ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Paul Moore @ 2023-11-29  1:06 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:

...

> > If we are going to have a record count, I imagine it would also be
> > helpful to maintain a securityfs file with the total size (in bytes)
> > of the in-memory measurement log.  In fact, I suspect this will
> > probably be more useful for those who wish to manage the size of the
> > measurement log.
>
> A running number of bytes needed for carrying the measurement list
> across kexec already exists.  This value would be affected when the
> measurement list is trimmed.

There we go, it should be trivial to export that information via securityfs.

> > > Defining other IMA securityfs files like
> > > how many times the measurement list has been trimmed might be
> > > beneficial as well.
> >
> > I have no objection to that.  Would a total record count, i.e. a value
> > that doesn't reset on a snapshot event, be more useful here?
>
> <securityfs>/ima/runtime_measurements_count already exports the total
> number of measurement records.

I guess the question is would you want 'runtime_measurements_count' to
reflect the current/trimmed log size or would you want it to reflect
the measurements since the initial cold boot?  Presumably we would
want to add another securityfs file to handle the case not covered by
'runtime_measurements_count'.

> > > Before defining a new critical-data record, we need to decide whether
> > > it is really necessary or if it is redundant.  If we define a new
> > > "critical-data" record, can it be defined such that it doesn't require
> > > pausing extending the measurement list?  For example, a new simple
> > > visual critical-data record could contain the number of records (e.g.
> > > <securityfs>/ima/runtime_measurements_count) up to that point.
> >
> > What if the snapshot_aggregate was a hash of the measurement log
> > starting with either the boot_aggregate or the latest
> > snapshot_aggregate and ending on the record before the new
> > snapshot_aggregate?  The performance impact at snapshot time should be
> > minimal as the hash can be incrementally updated as new records are
> > added to the measurement list.  While the hash wouldn't capture the
> > TPM state, it would allow some crude verification when reassembling
> > the log.  If one could bear the cost of a TPM signing operation, the
> > log digest could be signed by the TPM.
>
> Other critical data is calculated, before calling
> ima_measure_critical_data(), which adds the record to the measurement
> list and extends the TPM PCR.
>
> Signing the hash shouldn't be an issue if it behaves like other
> critical data.
>
> In addition to the hash, consider including other information in the
> new critical data record (e.g. total number of measurement records, the
> number of measurements included in the hash, the number of times the
> measurement list was trimmed, etc).

It would be nice if you could provide an explicit list of what you
would want hashed into a snapshot_aggregate record; the above is
close, but it is still a little hand-wavy.  I'm just trying to reduce
the back-n-forth :)

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-29  1:06                 ` Paul Moore
@ 2023-11-29  2:07                   ` Mimi Zohar
  2024-01-06 23:27                     ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2023-11-29  2:07 UTC (permalink / raw)
  To: Paul Moore
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> 
> ...
> 
> > > If we are going to have a record count, I imagine it would also be
> > > helpful to maintain a securityfs file with the total size (in bytes)
> > > of the in-memory measurement log.  In fact, I suspect this will
> > > probably be more useful for those who wish to manage the size of the
> > > measurement log.
> >
> > A running number of bytes needed for carrying the measurement list
> > across kexec already exists.  This value would be affected when the
> > measurement list is trimmed.
> 
> There we go, it should be trivial to export that information via securityfs.
> 
> > > > Defining other IMA securityfs files like
> > > > how many times the measurement list has been trimmed might be
> > > > beneficial as well.
> > >
> > > I have no objection to that.  Would a total record count, i.e. a value
> > > that doesn't reset on a snapshot event, be more useful here?
> >
> > <securityfs>/ima/runtime_measurements_count already exports the total
> > number of measurement records.
> 
> I guess the question is would you want 'runtime_measurements_count' to
> reflect the current/trimmed log size or would you want it to reflect
> the measurements since the initial cold boot?  Presumably we would
> want to add another securityfs file to handle the case not covered by
> 'runtime_measurements_count'.

Right.  <securityfs>/ima/runtime_measurements_count is defined as the
total number of measurements since boot.  When the measurement list is
carried across kexec, it is the number of measurements since cold boot.

A new securityfs file should be defined for the current number of
records in kernel memory.  Unless the measurement list has been
trimmed, this should be the same as runtime_measurements_count.
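
For illustration, such a file could be as simple as the sketch below;
the file name and the ima_current_record_count() helper are
assumptions:

    #include <linux/fs.h>
    #include <linux/kernel.h>
    #include <linux/uaccess.h>

    /* assumed helper returning the number of records currently in memory */
    unsigned long ima_current_record_count(void);

    static ssize_t ima_current_count_read(struct file *file, char __user *ubuf,
                                          size_t count, loff_t *ppos)
    {
        char buf[32];
        int len = scnprintf(buf, sizeof(buf), "%lu\n",
                            ima_current_record_count());

        return simple_read_from_buffer(ubuf, count, ppos, buf, len);
    }

    static const struct file_operations ima_current_count_ops = {
        .read = ima_current_count_read,
    };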

> 
> > > > Before defining a new critical-data record, we need to decide whether
> > > > it is really necessary or if it is redundant.  If we define a new
> > > > "critical-data" record, can it be defined such that it doesn't require
> > > > pausing extending the measurement list?  For example, a new simple
> > > > visual critical-data record could contain the number of records (e.g.
> > > > <securityfs>/ima/runtime_measurements_count) up to that point.
> > >
> > > What if the snapshot_aggregate was a hash of the measurement log
> > > starting with either the boot_aggregate or the latest
> > > snapshot_aggregate and ending on the record before the new
> > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > minimal as the hash can be incrementally updated as new records are
> > > added to the measurement list.  While the hash wouldn't capture the
> > > TPM state, it would allow some crude verification when reassembling
> > > the log.  If one could bear the cost of a TPM signing operation, the
> > > log digest could be signed by the TPM.
> >
> > Other critical data is calculated, before calling
> > ima_measure_critical_data(), which adds the record to the measurement
> > list and extends the TPM PCR.
> >
> > Signing the hash shouldn't be an issue if it behaves like other
> > critical data.
> >
> > In addition to the hash, consider including other information in the
> > new critical data record (e.g. total number of measurement records, the
> > number of measurements included in the hash, the number of times the
> > measurement list was trimmed, etc).
> 
> It would be nice if you could provide an explicit list of what you
> would want hashed into a snapshot_aggregate record; the above is
> close, but it is still a little hand-wavy.  I'm just trying to reduce
> the back-n-forth :)

What is being defined here is the first IMA critical-data record, which
really requires some thought.  For ease of review, this new critical-
data record should be a separate patch set from trimming the
measurement list.

As I'm sure you're aware, SELinux defines two critical-data records.
From security/selinux/ima.c:

        ima_measure_critical_data("selinux", "selinux-state",
                                  state_str, strlen(state_str), false,
                                  NULL, 0);

        ima_measure_critical_data("selinux", "selinux-policy-hash",
                                  policy, policy_len, true,
                                  NULL, 0);

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-22 14:22         ` Paul Moore
  2023-11-27 17:07           ` Mimi Zohar
@ 2023-12-20 22:13           ` Ken Goldman
  2024-01-06 23:44             ` Paul Moore
  1 sibling, 1 reply; 30+ messages in thread
From: Ken Goldman @ 2023-12-20 22:13 UTC (permalink / raw)
  To: Paul Moore, Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, bhe, vgoyal, Dave Young, kexec@lists.infradead.org, jmorris,
	serge, James Bottomley, linux-security-module, Tyler Hicks,
	Lakshmi Ramasubramanian, Sush Shringarputale

I'm still struggling with the "new root of trust" concept.

Something - a user space agent, a third party, etc. - has to
retain the entire log from event 0, because a new verifier
needs all measurements.

Therefore, the snapshot aggregate seems redundant.  It has to
be verified to match the snapshotted events.

A redundancy is an attack surface.  A badly written verifier
might not do that verification, and this permits snapshotted
events to be forged. No aggregate means the verifier can't
make a mistake.

On 11/22/2023 9:22 AM, Paul Moore wrote:
> I believe the intent is to only pause the measurements while the
> snapshot_aggregate is generated, not for the duration of the entire
> snapshot process.  The purpose of the snapshot_aggregate is to
> establish a new root of trust, similar to the boot_aggregate, to help
> improve attestation performance.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-11-29  2:07                   ` Mimi Zohar
@ 2024-01-06 23:27                     ` Paul Moore
  2024-01-07 12:58                       ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Paul Moore @ 2024-01-06 23:27 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:

...

> > > > > Before defining a new critical-data record, we need to decide whether
> > > > > it is really necessary or if it is redundant.  If we define a new
> > > > > "critical-data" record, can it be defined such that it doesn't require
> > > > > pausing extending the measurement list?  For example, a new simple
> > > > > visual critical-data record could contain the number of records (e.g.
> > > > > <securityfs>/ima/runtime_measurements_count) up to that point.
> > > >
> > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > starting with either the boot_aggregate or the latest
> > > > snapshot_aggregate and ending on the record before the new
> > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > minimal as the hash can be incrementally updated as new records are
> > > > added to the measurement list.  While the hash wouldn't capture the
> > > > TPM state, it would allow some crude verification when reassembling
> > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > log digest could be signed by the TPM.
> > >
> > > Other critical data is calculated, before calling
> > > ima_measure_critical_data(), which adds the record to the measurement
> > > list and extends the TPM PCR.
> > >
> > > Signing the hash shouldn't be an issue if it behaves like other
> > > critical data.
> > >
> > > In addition to the hash, consider including other information in the
> > > new critical data record (e.g. total number of measurement records, the
> > > number of measurements included in the hash, the number of times the
> > > measurement list was trimmed, etc).
> >
> > It would be nice if you could provide an explicit list of what you
> > would want hashed into a snapshot_aggregate record; the above is
> > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > the back-n-forth :)
>
> What is being defined here is the first IMA critical-data record, which
> really requires some thought.

My thinking has always been that taking a hash of the current
measurement log up to the snapshot point would be a nice
snapshot_aggregate measurement, but I'm not heavily invested in that.
To me it is more important that we find something we can all agree on,
perhaps reluctantly, so we can move forward with a solution.

> For ease of review, this new critical-
> data record should be a separate patch set from trimming the
> measurement list.

I see the two as linked, but if you prefer them as separate then so be
it.  Once again, the important part is to move forward with a
solution, I'm not overly bothered if it arrives in multiple pieces
instead of one.

> As I'm sure you're aware, SELinux defines two critical-data records.
> From security/selinux/ima.c:
>
>         ima_measure_critical_data("selinux", "selinux-state",
>                                   state_str, strlen(state_str), false,
>                                   NULL, 0);
>
>         ima_measure_critical_data("selinux", "selinux-policy-hash",
>                                   policy, policy_len, true,
>                                   NULL, 0);

Yep, but there is far more to this than SELinux.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2023-12-20 22:13           ` Ken Goldman
@ 2024-01-06 23:44             ` Paul Moore
  0 siblings, 0 replies; 30+ messages in thread
From: Paul Moore @ 2024-01-06 23:44 UTC (permalink / raw)
  To: Ken Goldman
  Cc: Mimi Zohar, Tushar Sugandhi, linux-integrity, peterhuewe,
	Jarkko Sakkinen, jgg, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Wed, Dec 20, 2023 at 5:14 PM Ken Goldman <kgold@linux.ibm.com> wrote:
>
> I'm still struggling with the "new root of trust" concept.
>
> Something - a user space agent, a third party, etc. - has to
> retain the entire log from event 0, because a new verifier
> needs all measurements.

[NOTE: a gentle reminder to please refrain from top-posting on Linux
kernel mailing lists, it is generally frowned upon and makes it
difficult to manage long running threads]

This is one of the reasons I have pushed to manage the snapshot, both
the trigger and the handling of the trimmed data, outside of the
kernel.  Setting aside the obvious limitations of kernel I/O, handling
the snapshot in userspace provides for a much richer set of options
when it comes to managing the snapshot and the
verification/attestation of the system.

> Therefore, the snapshot aggregate seems redundant.  It has to
> be verified to match the snapshotted events.

I can see a perspective where the snapshot_aggregate is theoretically
redundant, but I can also see at least one practical perspective where
a snapshot_aggregate could be used to simplify a remote attestation
with a sufficiently stateful attestation service.

> A redundancy is an attack surface.

Now that is an overly broad generalization; if we are going that
route, *everything* is an attack surface (and this is arguably true
regardless, although a bit of an extreme statement).

> A badly written verifier
> might not do that verification, and this permits snapshotted
> events to be forged. No aggregate means the verifier can't
> make a mistake.

I would ask that you read your own comment again.  A poorly written
verifier is subject to any number of pitfalls and vulnerabilities,
regardless of a snapshot aggregate.  As a reminder, the snapshotting
mechanism has always been proposed as an opt-in mechanism, if one has
not implemented a proper snapshot-aware attestation mechanism then
they can simply refrain from taking a snapshot and reject all
attestation attempts using a snapshot.

> On 11/22/2023 9:22 AM, Paul Moore wrote:
> > I believe the intent is to only pause the measurements while the
> > snapshot_aggregate is generated, not for the duration of the entire
> > snapshot process.  The purpose of the snapshot_aggregate is to
> > establish a new root of trust, similar to the boot_aggregate, to help
> > improve attestation performance.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2024-01-06 23:27                     ` Paul Moore
@ 2024-01-07 12:58                       ` Mimi Zohar
  2024-01-08  2:58                         ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2024-01-07 12:58 UTC (permalink / raw)
  To: Paul Moore
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> 
> ...
> 
> > > > > > Before defining a new critical-data record, we need to decide whether
> > > > > > it is really necessary or if it is redundant.  If we define a new
> > > > > > "critical-data" record, can it be defined such that it doesn't require
> > > > > > pausing extending the measurement list?  For example, a new simple
> > > > > > visual critical-data record could contain the number of records (e.g.
> > > > > > <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > >
> > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > starting with either the boot_aggregate or the latest
> > > > > snapshot_aggregate and ending on the record before the new
> > > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > > minimal as the hash can be incrementally updated as new records are
> > > > > added to the measurement list.  While the hash wouldn't capture the
> > > > > TPM state, it would allow some crude verification when reassembling
> > > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > > log digest could be signed by the TPM.
> > > >
> > > > Other critical data is calculated, before calling
> > > > ima_measure_critical_data(), which adds the record to the measurement
> > > > list and extends the TPM PCR.
> > > >
> > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > critical data.
> > > >
> > > > In addition to the hash, consider including other information in the
> > > > new critical data record (e.g. total number of measurement records, the
> > > > number of measurements included in the hash, the number of times the
> > > > measurement list was trimmed, etc).
> > >
> > > It would be nice if you could provide an explicit list of what you
> > > would want hashed into a snapshot_aggregate record; the above is
> > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > the back-n-forth :)
> >
> > What is being defined here is the first IMA critical-data record, which
> > really requires some thought.
> 
> My thinking has always been that taking a hash of the current
> measurement log up to the snapshot point would be a nice
> snapshot_aggregate measurement, but I'm not heavily invested in that.
> To me it is more important that we find something we can all agree on,
> perhaps reluctantly, so we can move forward with a solution.
> 
> > For ease of review, this new critical-
> > data record should be a separate patch set from trimming the
> > measurement list.
> 
> I see the two as linked, but if you prefer them as separate then so be
> it.  Once again, the important part is to move forward with a
> solution, I'm not overly bothered if it arrives in multiple pieces
> instead of one.

Trimming the IMA measurement list could be used in conjunction with the new IMA
critical data record or independently.  Both options should be supported.

1. trim N number of records from the head of the in kernel IMA measurement list
2. intermittently include the new IMA critical data record based on some trigger
3. trim the measurement list up to the (first/last/Nth) IMA critical data record

Since the two features could be used independently of each other, there is no
reason to upstream them as a single patch set.  It just makes it harder to
review.

> 
> > As I'm sure you're aware, SELinux defines two critical-data records.
> > From security/selinux/ima.c:
> >
> >         ima_measure_critical_data("selinux", "selinux-state",
> >                                   state_str, strlen(state_str), false,
> >                                   NULL, 0);
> >
> >         ima_measure_critical_data("selinux", "selinux-policy-hash",
> >                                   policy, policy_len, true,
> >                                   NULL, 0);
> 
> Yep, but there is far more to this than SELinux.

Only if you conflate the two features. 

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2024-01-07 12:58                       ` Mimi Zohar
@ 2024-01-08  2:58                         ` Paul Moore
  2024-01-08 11:48                           ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Paul Moore @ 2024-01-08  2:58 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> >
> > ...
> >
> > > > > > > Before defining a new critical-data record, we need to decide whether
> > > > > > > it is really necessary or if it is redundant.  If we define a new
> > > > > > > "critical-data" record, can it be defined such that it doesn't require
> > > > > > > pausing extending the measurement list?  For example, a new simple
> > > > > > > visual critical-data record could contain the number of records (e.g.
> > > > > > > <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > >
> > > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > > starting with either the boot_aggregate or the latest
> > > > > > snapshot_aggregate and ending on the record before the new
> > > > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > > > minimal as the hash can be incrementally updated as new records are
> > > > > > added to the measurement list.  While the hash wouldn't capture the
> > > > > > TPM state, it would allow some crude verification when reassembling
> > > > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > > > log digest could be signed by the TPM.
> > > > >
> > > > > Other critical data is calculated, before calling
> > > > > ima_measure_critical_data(), which adds the record to the measurement
> > > > > list and extends the TPM PCR.
> > > > >
> > > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > > critical data.
> > > > >
> > > > > In addition to the hash, consider including other information in the
> > > > > new critical data record (e.g. total number of measurement records, the
> > > > > number of measurements included in the hash, the number of times the
> > > > > measurement list was trimmed, etc).
> > > >
> > > > It would be nice if you could provide an explicit list of what you
> > > > would want hashed into a snapshot_aggregate record; the above is
> > > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > > the back-n-forth :)
> > >
> > > What is being defined here is the first IMA critical-data record, which
> > > really requires some thought.
> >
> > My thinking has always been that taking a hash of the current
> > measurement log up to the snapshot point would be a nice
> > snapshot_aggregate measurement, but I'm not heavily invested in that.
> > To me it is more important that we find something we can all agree on,
> > perhaps reluctantly, so we can move forward with a solution.
> >
> > > For ease of review, this new critical-
> > > data record should be a separate patch set from trimming the
> > > measurement list.
> >
> > I see the two as linked, but if you prefer them as separate then so be
> > it.  Once again, the important part is to move forward with a
> > solution, I'm not overly bothered if it arrives in multiple pieces
> > instead of one.
>
> Trimming the IMA measurement list could be used in conjunction with the new IMA
> critical data record or independently.  Both options should be supported.
>
> 1. trim N number of records from the head of the in kernel IMA measurement list
> 2. intermittently include the new IMA critical data record based on some trigger
> 3. trim the measurement list up to the (first/last/Nth) IMA critical data record
>
> Since the two features could be used independently of each other, there is no
> reason to upstream them as a single patch set.  It just makes it harder to
> review.

I don't see much point in recording a snapshot aggregate if you aren't
doing a snapshot, but it's not harmful in any way, so sure, go for it.
Like I said earlier, as long as the functionality is there, I don't
think anyone cares too much how it gets into the kernel (although
Tushar and Sush should comment from their perspective).

> > > As I'm sure you're aware, SELinux defines two critical-data records.
> > > From security/selinux/ima.c:
> > >
> > >         ima_measure_critical_data("selinux", "selinux-state",
> > >                                   state_str, strlen(state_str), false,
> > >                                   NULL, 0);
> > >
> > >         ima_measure_critical_data("selinux", "selinux-policy-hash",
> > >                                   policy, policy_len, true,
> > >                                   NULL, 0);
> >
> > Yep, but there is far more to this than SELinux.
>
> Only if you conflate the two features.

If that is a clever retort, you'll need to elaborate a bit as it
doesn't make much sense to me.  The IMA measurement log snapshot is
independent from SELinux; the only connection is that yes, IMA does
measure SELinux "things" but that is no different from any other
system attribute that is measured by IMA.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2024-01-08  2:58                         ` Paul Moore
@ 2024-01-08 11:48                           ` Mimi Zohar
  2024-01-08 17:15                             ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2024-01-08 11:48 UTC (permalink / raw)
  To: Paul Moore
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Sun, 2024-01-07 at 21:58 -0500, Paul Moore wrote:
> On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > >
> > > ...
> > >
> > > > > > > > Before defining a new critical-data record, we need to decide whether
> > > > > > > > it is really necessary or if it is redundant.  If we define a new
> > > > > > > > "critical-data" record, can it be defined such that it doesn't require
> > > > > > > > pausing extending the measurement list?  For example, a new simple
> > > > > > > > visual critical-data record could contain the number of records (e.g.
> > > > > > > > <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > > >
> > > > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > > > starting with either the boot_aggregate or the latest
> > > > > > > snapshot_aggregate and ending on the record before the new
> > > > > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > > > > minimal as the hash can be incrementally updated as new records are
> > > > > > > added to the measurement list.  While the hash wouldn't capture the
> > > > > > > TPM state, it would allow some crude verification when reassembling
> > > > > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > > > > log digest could be signed by the TPM.
> > > > > >
> > > > > > Other critical data is calculated, before calling
> > > > > > ima_measure_critical_data(), which adds the record to the measurement
> > > > > > list and extends the TPM PCR.
> > > > > >
> > > > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > > > critical data.
> > > > > >
> > > > > > In addition to the hash, consider including other information in the
> > > > > > new critical data record (e.g. total number of measurement records, the
> > > > > > number of measurements included in the hash, the number of times the
> > > > > > measurement list was trimmed, etc).
> > > > >
> > > > > It would be nice if you could provide an explicit list of what you
> > > > > would want hashed into a snapshot_aggregate record; the above is
> > > > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > > > the back-n-forth :)
> > > >
> > > > What is being defined here is the first IMA critical-data record, which
> > > > really requires some thought.
> > >
> > > My thinking has always been that taking a hash of the current
> > > measurement log up to the snapshot point would be a nice
> > > snapshot_aggregate measurement, but I'm not heavily invested in that.
> > > To me it is more important that we find something we can all agree on,
> > > perhaps reluctantly, so we can move forward with a solution.
> > >
> > > > For ease of review, this new critical-
> > > > data record should be a separate patch set from trimming the
> > > > measurement list.
> > >
> > > I see the two as linked, but if you prefer them as separate then so be
> > > it.  Once again, the important part is to move forward with a
> > > solution, I'm not overly bothered if it arrives in multiple pieces
> > > instead of one.
> >
> > Trimming the IMA measurement list could be used in conjunction with the new IMA
> > critical data record or independently.  Both options should be supported.
> >
> > 1. trim N number of records from the head of the in kernel IMA measurement list
> > 2. intermittently include the new IMA critical data record based on some trigger
> > 3. trim the measurement list up to the (first/last/Nth) IMA critical data record
> >
> > Since the two features could be used independently of each other, there is no
> > reason to upstream them as a single patch set.  It just makes it harder to
> > review.
> 
> I don't see much point in recording a snapshot aggregate if you aren't
> doing a snapshot, but it's not harmful in any way, so sure, go for it.
> Like I said earlier, as long as the functionality is there, I don't
> think anyone cares too much how it gets into the kernel (although
> Tushar and Sush should comment from their perspective).

Paul, there are two features: 
- trimming the measurement list
- defining and including an IMA critical data record

The original design doc combined these two features, making them an "atomic"
operation, and referred to it as a snapshot.  At the time, "snapshot" was an
appropriate term for the IMA critical data record.  Now, not so much.

These are two separate, independent features.  Trimming the measurement list
should not be dependent on the IMA critical data record.  The IMA critical data
record should not be dependent on trimming the measurement list.  Trimming the
measurement list up to the (first/last/Nth) critical data record should be
optional.

> 
> > > > As I'm sure you're aware, SELinux defines two critical-data records.
> > > > From security/selinux/ima.c:
> > > >
> > > >         ima_measure_critical_data("selinux", "selinux-state",
> > > >                                   state_str, strlen(state_str), false,
> > > >                                   NULL, 0);
> > > >
> > > >         ima_measure_critical_data("selinux", "selinux-policy-hash",
> > > >                                   policy, policy_len, true,
> > > >                                   NULL, 0);
> > >
> > > Yep, but there is far more to this than SELinux.
> >
> > Only if you conflate the two features.
> 
> If that is a clever retort, you'll need to elaborate a bit as it
> doesn't make much sense to me.  The IMA measurement log snapshot is
> independent from SELinux; the only connection is that yes, IMA does
> measure SELinux "things" but that is no different from any other
> system attribute that is measured by IMA.

The IMA critical data record should not be that different from, or more
difficult than, the SELinux critical data record.  Only if you conflate the two
features being discussed - trimming the IMA measurement list and the IMA
critical data record - does it become "far more".

--  
Mimi  




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC V2] IMA Log Snapshotting Design Proposal
  2024-01-08 11:48                           ` Mimi Zohar
@ 2024-01-08 17:15                             ` Paul Moore
  0 siblings, 0 replies; 30+ messages in thread
From: Paul Moore @ 2024-01-08 17:15 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Tushar Sugandhi, linux-integrity, peterhuewe, Jarkko Sakkinen,
	jgg, Ken Goldman, bhe, vgoyal, Dave Young,
	kexec@lists.infradead.org, jmorris, serge, James Bottomley,
	linux-security-module, Tyler Hicks, Lakshmi Ramasubramanian,
	Sush Shringarputale

On Mon, Jan 8, 2024 at 6:48 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Sun, 2024-01-07 at 21:58 -0500, Paul Moore wrote:
> > On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > > > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > > >
> > > > ...
> > > >
> > > > > > > > > Before defining a new critical-data record, we need to decide whether
> > > > > > > > > it is really necessary or if it is redundant.  If we define a new
> > > > > > > > > "critical-data" record, can it be defined such that it doesn't require
> > > > > > > > > pausing extending the measurement list?  For example, a new simple
> > > > > > > > > visual critical-data record could contain the number of records (e.g.
> > > > > > > > > <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > > > >
> > > > > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > > > > starting with either the boot_aggregate or the latest
> > > > > > > > snapshot_aggregate and ending on the record before the new
> > > > > > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > > > > > minimal as the hash can be incrementally updated as new records are
> > > > > > > > added to the measurement list.  While the hash wouldn't capture the
> > > > > > > > TPM state, it would allow some crude verification when reassembling
> > > > > > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > > > > > log digest could be signed by the TPM.
> > > > > > >
> > > > > > > Other critical data is calculated, before calling
> > > > > > > ima_measure_critical_data(), which adds the record to the measurement
> > > > > > > list and extends the TPM PCR.
> > > > > > >
> > > > > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > > > > critical data.
> > > > > > >
> > > > > > > In addition to the hash, consider including other information in the
> > > > > > > new critical data record (e.g. total number of measurement records, the
> > > > > > > number of measurements included in the hash, the number of times the
> > > > > > > measurement list was trimmed, etc).
> > > > > >
> > > > > > It would be nice if you could provide an explicit list of what you
> > > > > > would want hashed into a snapshot_aggregate record; the above is
> > > > > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > > > > the back-n-forth :)
> > > > >
> > > > > What is being defined here is the first IMA critical-data record, which
> > > > > really requires some thought.
> > > >
> > > > My thinking has always been that taking a hash of the current
> > > > measurement log up to the snapshot point would be a nice
> > > > snapshot_aggregate measurement, but I'm not heavily invested in that.
> > > > To me it is more important that we find something we can all agree on,
> > > > perhaps reluctantly, so we can move forward with a solution.
> > > >
> > > > > For ease of review, this new critical-
> > > > > data record should be a separate patch set from trimming the
> > > > > measurement list.
> > > >
> > > > I see the two as linked, but if you prefer them as separate then so be
> > > > it.  Once again, the important part is to move forward with a
> > > > solution, I'm not overly bothered if it arrives in multiple pieces
> > > > instead of one.
> > >
> > > Trimming the IMA measurement list could be used in conjunction with the new IMA
> > > critical data record or independently.  Both options should be supported.
> > >
> > > 1. trim N records from the head of the in-kernel IMA measurement list
> > > 2. intermittently include the new IMA critical data record based on some trigger
> > > 3. trim the measurement list up to the (first/last/Nth) IMA critical data record
> > >
> > > Since the two features could be used independently of each other, there is no
> > > reason to upstream them as a single patch set.  It just makes it harder to
> > > review.
> >
> > I don't see much point in recording a snapshot aggregate if you aren't
> > doing a snapshot, but it's not harmful in any way, so sure, go for it.
> > Like I said earlier, as long as the functionality is there, I don't
> > think anyone cares too much how it gets into the kernel (although
> > Tushar and Sush should comment from their perspective).
>
> Paul, there are two features:
> - trimming the measurement list
> - defining and including an IMA critical data record
>
> The original design doc combined these two features, making them an "atomic"
> operation, and referred to the combination as a snapshot.  At the time,
> "snapshot" was an appropriate term for the IMA critical data record; now,
> not so much.
>
> These are two separate, independent features.  Trimming the measurement list
> should not be dependent on the IMA critical data record.  The IMA critical data
> record should not be dependent on trimming the measurement list.  Trimming the
> measurement list up to the (first/last/Nth) critical data record should be
> optional.

Mimi, do you keep missing the part in my replies where I mention that
I don't really care as long as the trimming and the aggregate both make
it into the kernel?  I don't agree with your assertion that the two
pieces of functionality are independent, but it doesn't really matter;
as long as the functionality is in place, userspace can be made to do
what Tushar and Sush need to do.

> > > > > As I'm sure you're aware, SELinux defines two critical-data records.
> > > > > From security/selinux/ima.c:
> > > > >
> > > > >         ima_measure_critical_data("selinux", "selinux-state",
> > > > >                                   state_str, strlen(state_str), false,
> > > > >                                   NULL, 0);
> > > > >
> > > > >         ima_measure_critical_data("selinux", "selinux-policy-hash",
> > > > >                                   policy, policy_len, true,
> > > > >                                   NULL, 0);
> > > >
> > > > Yep, but there is far more to this than SELinux.
> > >
> > > Only if you conflate the two features.
> >
> > If that is a clever retort, you'll need to elaborate a bit as it
> > doesn't make much sense to me.  The IMA measurement log snapshot is
> > independent from SELinux; the only connection is that yes, IMA does
> > measure SELinux "things" but that is no different from any other
> > system attribute that is measured by IMA.
>
> The IMA critical data record should not be that different from, or more
> difficult than, the SELinux critical data record.  Only if you conflate the
> two features being discussed - trimming the IMA measurement list and the
> IMA critical data record - does it become "far more".

I still don't understand why you've brought SELinux into this; it
makes no sense from my perspective, as the two are "separate,
independent features".  Are you trying to draw some odd parallel
between snapshotting and IMA/SELinux?  I'd suggest just dropping this
comparison and focusing on more concrete things like "the aggregate
should contain X, Y, and Z", etc., as that is something we can all
understand and use as a starting point for continued forward
progress.
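
As a purely illustrative starting point, the sketch below shows one way an
incrementally maintained aggregate could carry the kinds of fields mentioned
earlier in the thread (total records, records covered by the digest, trim
count, and the digest itself).  It is a userspace sketch in C using OpenSSL
for the hashing; every struct and function name in it is a placeholder
rather than an existing kernel or IMA interface.

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>

/* Hypothetical contents of the proposed aggregate/critical-data record. */
struct aggregate_record {
	uint64_t total_records;		/* records ever added to the list  */
	uint64_t records_in_digest;	/* records covered by 'digest'     */
	uint64_t trim_count;		/* times the list has been trimmed */
	unsigned char digest[EVP_MAX_MD_SIZE];
	unsigned int digest_len;
};

static EVP_MD_CTX *running;		/* running digest over the log     */
static struct aggregate_record agg;	/* counters for the next aggregate */

/* Called once per measurement record as it is appended to the list. */
static int record_added(const void *rec, size_t len)
{
	if (!running) {
		running = EVP_MD_CTX_new();
		if (!running || !EVP_DigestInit_ex(running, EVP_sha256(), NULL))
			return -1;
	}
	if (!EVP_DigestUpdate(running, rec, len))
		return -1;
	agg.total_records++;
	agg.records_in_digest++;
	return 0;
}

/* Finalize the aggregate and restart the digest for the next span. */
static int emit_aggregate(struct aggregate_record *out)
{
	if (!running || !EVP_DigestFinal_ex(running, agg.digest, &agg.digest_len))
		return -1;
	*out = agg;
	EVP_MD_CTX_free(running);
	running = NULL;
	agg.records_in_digest = 0;
	return 0;
}

int main(void)
{
	struct aggregate_record rec;
	const char *sample = "example measurement record";

	record_added(sample, strlen(sample));
	if (emit_aggregate(&rec) == 0)
		printf("digest over %llu record(s), %u bytes\n",
		       (unsigned long long)rec.records_in_digest,
		       rec.digest_len);
	return 0;
}

In kernel terms the running digest would presumably live alongside the
measurement list and be updated as each record is appended, with the
resulting record fed through ima_measure_critical_data() like any other
critical data; none of that is prescribed by the sketch above.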

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2024-01-08 17:15 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-19 18:49 [RFC V2] IMA Log Snapshotting Design Proposal Tushar Sugandhi
2023-10-31 18:37 ` Ken Goldman
2023-11-13 18:14   ` Sush Shringarputale
2023-10-31 19:15 ` Mimi Zohar
2023-11-16 22:28   ` Paul Moore
2023-11-22  1:01     ` Tushar Sugandhi
2023-11-22  1:18       ` Mimi Zohar
2023-11-22  4:27     ` Paul Moore
2023-11-22 13:18       ` Mimi Zohar
2023-11-22 14:22         ` Paul Moore
2023-11-27 17:07           ` Mimi Zohar
2023-11-27 22:16             ` Paul Moore
2023-11-28 12:09               ` Mimi Zohar
2023-11-29  1:06                 ` Paul Moore
2023-11-29  2:07                   ` Mimi Zohar
2024-01-06 23:27                     ` Paul Moore
2024-01-07 12:58                       ` Mimi Zohar
2024-01-08  2:58                         ` Paul Moore
2024-01-08 11:48                           ` Mimi Zohar
2024-01-08 17:15                             ` Paul Moore
2023-12-20 22:13           ` Ken Goldman
2024-01-06 23:44             ` Paul Moore
2023-11-13 18:59 ` Stefan Berger
2023-11-14 18:36   ` Sush Shringarputale
2023-11-14 18:58     ` Stefan Berger
2023-11-16 22:07       ` Paul Moore
2023-11-16 22:41         ` Stefan Berger
2023-11-16 22:56           ` Paul Moore
2023-11-17 22:41             ` Sush Shringarputale
2023-11-20 20:03         ` Tushar Sugandhi
