Skip to content

F2: Module Executor -- apcore Module Trait for CLI Subprocess Execution

Field Value
Feature ID F2
Tech Design Section 5.6
Priority P0 (Core)
Dependencies F1 (Scanner Adapter), F6 (Error Migration)
Depended On By F4 (MCP Server), F5 (Governance)
New Files src/module/mod.rs, src/module/cli_module.rs, src/module/executor.rs
Deleted Files src/executor/mod.rs (absorbed)
Estimated LOC ~500
Estimated Tests ~25

1. Purpose

Implement the apcore Module trait for CLI subprocess execution. Each scanned CLI command becomes a CliModule that can be registered in an apcore Registry, executed through an apcore Executor, and participate in middleware chains. This is the central integration point that connects apexe's scanning output to the apcore runtime.


2. Module Structure

2.1 src/module/mod.rs

pub mod cli_module;
pub mod executor;

pub use cli_module::CliModule;

2.2 src/module/cli_module.rs -- CliModule

use std::sync::Arc;
use apcore::{Context, Module, ModuleAnnotations, ModuleError, SharedData};
use apcore_toolkit::ScannedModule;
use serde_json::Value;

use crate::governance::{AuditManager, SandboxManager};

/// An apcore Module implementation that executes a CLI command as a subprocess.
pub struct CliModule {
    /// Unique module identifier (e.g., "cli.git.commit").
    module_id: String,
    /// Human-readable description.
    description: String,
    /// JSON Schema for valid inputs.
    input_schema: Value,
    /// JSON Schema for expected outputs.
    output_schema: Value,
    /// Module behavioral annotations.
    annotations: ModuleAnnotations,
    /// Absolute path to the CLI binary.
    binary_path: String,
    /// Command parts after the binary (e.g., ["container", "ls"] for docker).
    command_parts: Vec<String>,
    /// Flag to enable structured JSON output (e.g., "--format json").
    json_flag: Option<String>,
    /// Subprocess timeout in milliseconds.
    timeout_ms: u64,
    /// Optional sandbox for subprocess isolation.
    sandbox: Option<Arc<SandboxManager>>,
    /// Optional audit logger.
    audit: Option<Arc<AuditManager>>,
}

2.3 Construction Methods

impl CliModule {
    /// Create a CliModule from a ScannedModule and runtime dependencies.
    ///
    /// Parses the `target` field (format: "exec://{binary_path} {command_parts}")
    /// and extracts the json_flag from metadata.
    pub fn from_scanned(
        module: &ScannedModule,
        timeout_ms: u64,
        sandbox: Option<Arc<SandboxManager>>,
        audit: Option<Arc<AuditManager>>,
    ) -> Result<Self, ModuleError>;

    /// Create a CliModule directly with all parameters.
    pub fn new(
        module_id: String,
        description: String,
        input_schema: Value,
        output_schema: Value,
        annotations: ModuleAnnotations,
        binary_path: String,
        command_parts: Vec<String>,
        json_flag: Option<String>,
        timeout_ms: u64,
        sandbox: Option<Arc<SandboxManager>>,
        audit: Option<Arc<AuditManager>>,
    ) -> Self;
}

2.4 Module Trait Implementation

#[async_trait::async_trait]
impl Module for CliModule {
    /// Execute the CLI command with the given input.
    ///
    /// Steps:
    /// 1. Extract trace_id from Context for correlation.
    /// 2. Build command arguments from input JSON (see Section 3.1).
    /// 3. If sandbox is enabled, delegate to SandboxManager.
    /// 4. Otherwise, spawn_blocking for subprocess execution (see Section 3.2).
    /// 5. Parse subprocess output into JSON result (see Section 3.3).
    /// 6. If audit is enabled, log the execution.
    /// 7. Return result or ModuleError.
    async fn execute(
        &self,
        ctx: &Context<SharedData>,
        input: Value,
    ) -> Result<Value, ModuleError>;

    /// Return the input JSON Schema.
    fn input_schema(&self) -> Option<Value> {
        Some(self.input_schema.clone())
    }

    /// Return the output JSON Schema.
    fn output_schema(&self) -> Option<Value> {
        Some(self.output_schema.clone())
    }

    /// Return the module description.
    fn description(&self) -> &str {
        &self.description
    }

    /// Pre-execution validation: check for shell injection in input values.
    ///
    /// Returns Err(ModuleError) with ErrorCode::ValidationFailed if
    /// any input value contains shell metacharacters.
    fn preflight(&self, input: &Value) -> Result<(), ModuleError>;
}

3. Execution Logic

3.1 Argument Building

Extracted from current src/executor/mod.rs execute_cli() function.

// src/module/executor.rs

/// Characters prohibited in command arguments to prevent shell injection.
const SHELL_INJECTION_CHARS: &[char] = &[';', '|', '&', '$', '`', '\\', '\'', '"', '\n', '\r'];

/// Build a Vec<String> of command-line arguments from JSON input.
///
/// Rules (preserved from v0.1.x):
/// - Null values: skipped
/// - Boolean true: append --{key} (underscores become hyphens)
/// - Boolean false: omit
/// - Array values: append --{key} {item} for each item
/// - Other values: append --{key} {value}
///
/// All string values are validated against SHELL_INJECTION_CHARS.
pub fn build_arguments(
    kwargs: &serde_json::Map<String, Value>,
) -> Result<Vec<String>, ModuleError>;

/// Validate a single string value contains no shell injection characters.
pub fn validate_no_injection(param_name: &str, value: &str) -> Result<(), ModuleError>;

/// Convert a JSON value to its string representation for command arguments.
fn json_value_to_string(value: &Value) -> String;

3.2 Subprocess Execution

// src/module/executor.rs

/// Execute a CLI subprocess and return raw output.
///
/// Uses tokio::task::spawn_blocking to avoid blocking the async executor.
/// Applies timeout via tokio::time::timeout.
pub async fn execute_subprocess(
    binary_path: &str,
    args: &[String],
    json_flag: Option<&str>,
    working_dir: Option<&str>,
    timeout_ms: u64,
) -> Result<SubprocessOutput, ModuleError>;

/// Raw subprocess output.
pub struct SubprocessOutput {
    pub stdout: String,
    pub stderr: String,
    pub exit_code: i32,
}

Key changes from v0.1.x: - Now uses tokio::task::spawn_blocking instead of synchronous Command::output(). - Now uses tokio::time::timeout instead of ignoring the _apexe_timeout parameter. - Returns ModuleError instead of ApexeError.

3.3 Output Parsing

// Inside CliModule::execute()

fn parse_output(output: SubprocessOutput, json_flag: &Option<String>) -> Value {
    let mut result = serde_json::Map::new();
    result.insert("stdout".into(), Value::String(output.stdout.clone()));
    result.insert("stderr".into(), Value::String(output.stderr));
    result.insert("exit_code".into(), Value::Number(output.exit_code.into()));

    // Attempt JSON parsing if json_flag was set
    if json_flag.is_some() && !output.stdout.trim().is_empty() {
        if let Ok(parsed) = serde_json::from_str::<Value>(&output.stdout) {
            result.insert("json_output".into(), parsed);
        }
    }

    Value::Object(result)
}

3.4 Preflight Validation

// Inside CliModule::preflight()

fn preflight(&self, input: &Value) -> Result<(), ModuleError> {
    if let Value::Object(map) = input {
        for (key, value) in map {
            match value {
                Value::String(s) => validate_no_injection(key, s)?,
                Value::Array(items) => {
                    for item in items {
                        if let Value::String(s) = item {
                            validate_no_injection(key, s)?;
                        }
                    }
                }
                _ => {} // Non-string values cannot contain injection
            }
        }
    }
    Ok(())
}

4. Target Field Parsing

The ScannedModule.target field encodes the binary path and command:

Format: exec://{binary_path} {command_part_1} {command_part_2} ...
Example: exec:///usr/bin/git commit
Example: exec:///usr/bin/docker container ls
Example: exec:///usr/local/bin/ffmpeg

Parsing logic in CliModule::from_scanned():

fn parse_target(target: &str) -> Result<(String, Vec<String>), ModuleError> {
    let stripped = target.strip_prefix("exec://")
        .ok_or_else(|| ModuleError {
            code: ErrorCode::ValidationFailed,
            message: format!("Invalid target format: {}", target),
            ..Default::default()
        })?;

    let parts: Vec<&str> = stripped.split_whitespace().collect();
    if parts.is_empty() {
        return Err(ModuleError { code: ErrorCode::ValidationFailed, .. });
    }

    let binary_path = parts[0].to_string();
    let command_parts = parts[1..].iter().map(|s| s.to_string()).collect();
    Ok((binary_path, command_parts))
}

5. Test Scenarios

5.1 Construction Tests

Test Name Scenario Expected
test_cli_module_from_scanned_basic Valid ScannedModule CliModule created with correct fields
test_cli_module_from_scanned_no_json_flag Module without json_flag metadata json_flag = None
test_cli_module_from_scanned_invalid_target target = "invalid" Err(ModuleError) with ValidationFailed
test_cli_module_from_scanned_empty_target target = "exec://" Err(ModuleError)
test_cli_module_new_direct All parameters provided Fields match inputs

5.2 Trait Method Tests

Test Name Scenario Expected
test_cli_module_input_schema_returns_some Module with schema Some(schema)
test_cli_module_output_schema_returns_some Module with schema Some(schema)
test_cli_module_description_returns_string Module with desc Non-empty string

5.3 Argument Building Tests

Test Name Scenario Expected
test_build_arguments_string_value {"file": "test.txt"} ["--file", "test.txt"]
test_build_arguments_boolean_true {"all": true} ["--all"]
test_build_arguments_boolean_false {"all": false} [] (omitted)
test_build_arguments_null_skipped {"x": null} []
test_build_arguments_array_values {"include": ["a","b"]} ["--include", "a", "--include", "b"]
test_build_arguments_underscore_to_hyphen {"no_cache": true} ["--no-cache"]
test_build_arguments_integer_value {"count": 5} ["--count", "5"]
test_build_arguments_injection_blocked {"msg": "hi; rm"} Err(ModuleError)

5.4 Execution Tests

Test Name Scenario Expected
test_execute_echo_returns_stdout Execute echo hello stdout contains "hello", exit_code = 0
test_execute_false_nonzero_exit Execute false exit_code != 0
test_execute_json_output_parsed Echo valid JSON with json_flag json_output key present
test_execute_timeout_returns_error Command that hangs, timeout 1ms Err with Timeout error code
test_execute_nonexistent_binary Binary = "/nonexistent" Err with InternalError

5.5 Preflight Tests

Test Name Scenario Expected
test_preflight_clean_input_passes {"file": "/path/to/file"} Ok(())
test_preflight_injection_semicolon {"arg": "a;b"} Err(ValidationFailed)
test_preflight_injection_pipe {"arg": "a|b"} Err(ValidationFailed)
test_preflight_injection_in_array {"args": ["ok", "bad$"]} Err(ValidationFailed)
test_preflight_non_string_passes {"count": 5} Ok(())

6. Migration from v0.1.x

Code Preserved

The following logic is extracted from src/executor/mod.rs into src/module/executor.rs: - SHELL_INJECTION_CHARS constant - validate_no_injection() function - json_value_to_string() function - Argument building loop from execute_cli()

Code Changed

  • execute_cli() is split into build_arguments() + execute_subprocess().
  • Timeout is now enforced via tokio::time::timeout (was ignored in v0.1.x).
  • Error types change from ApexeError to ModuleError (uses F6 conversions).
  • Subprocess runs via tokio::task::spawn_blocking (was synchronous).

Code Deleted

  • src/executor/mod.rs is deleted entirely. Its logic lives in src/module/executor.rs and src/module/cli_module.rs.

7. Thread Safety

CliModule is Send + Sync because: - All fields are either owned values or Arc-wrapped. - execute() is async and uses spawn_blocking for the subprocess call. - No interior mutability (&self only in all methods).

This is required for registration in apcore's Registry and use in async handlers.