[Low] OS-Level Sandboxing

## Summary
Implement operating system-level sandboxing to isolate AI-executed commands, preventing accidental or malicious access to sensitive files, network resources, or system operations.

## Current State
Amplifier CLI executes bash commands with the same permissions as the user:
- Full filesystem access
- Full network access
- Can modify any file the user can write, including system files when run with elevated privileges
- Can access sensitive data (~/.ssh, credentials, etc.)

This is a security risk, especially for:
- Untrusted prompts
- Automated/unattended operation
- Multi-tenant environments

## Proposed Implementation

### 1. Sandbox Modes

| Mode | Description | Use Case |
|------|-------------|----------|
| `off` | No sandboxing (current behavior) | Trusted, interactive use |
| `permissive` | Log violations but allow | Development, auditing |
| `standard` | Block dangerous operations | Normal use |
| `strict` | Minimal access only | Untrusted prompts |

### 2. Platform-Specific Backends

#### macOS: `sandbox-exec`
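Uses Apple's built-in Seatbelt sandbox. Note that the `sandbox-exec` man page marks the tool as deprecated, although it still ships with macOS: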

```scheme
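;; amplifier-sandbox.sb
;; ${PROJECT_DIR}, ${TMPDIR} and ${HOME} are placeholders filled in when the
;; policy is generated; sandbox-exec does not expand them itself.
(version 1)
(deny default)

;; Let the sandboxed shell start and spawn child processes
;; (a working profile will likely need further allowances)
(allow process-exec)
(allow process-fork)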

;; Allow reading most files
(allow file-read*)

;; Allow writing only to project directory
(allow file-write*
    (subpath "${PROJECT_DIR}")
    (subpath "${TMPDIR}"))

;; Block sensitive paths
(deny file-read*
    (subpath "${HOME}/.ssh")
    (subpath "${HOME}/.gnupg")
    (subpath "${HOME}/.aws")
    (regex #".*\.env$"))

;; Network access (optional)
(allow network-outbound
    (remote tcp "localhost:*")
    (remote tcp "*:80")
    (remote tcp "*:443"))
```
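The `${...}` placeholders in the profile have to be substituted before the file is handed to `sandbox-exec`. A minimal sketch of that step, assuming Python's `string.Template` is used for rendering; `render_policy` and the abridged `POLICY_TEMPLATE` are hypothetical names, not part of the existing codebase:

```python
import tempfile
from pathlib import Path
from string import Template
from typing import Optional

# Abridged version of the profile above. A literal "$" (as in the regex)
# must be written "$$" so that Template does not treat it as a placeholder.
POLICY_TEMPLATE = Template('''\
(version 1)
(deny default)
(allow file-read*)
(allow file-write*
    (subpath "${PROJECT_DIR}")
    (subpath "${TMPDIR}"))
(deny file-read*
    (subpath "${HOME}/.ssh")
    (regex #".*\\.env$$"))
''')


def render_policy(
    project_dir: Path,
    home: Optional[Path] = None,
    tmpdir: Optional[Path] = None,
) -> str:
    """Fill the placeholders with absolute, symlink-resolved paths."""
    values = {
        "PROJECT_DIR": str(project_dir.resolve()),
        "HOME": str((home or Path.home()).resolve()),
        "TMPDIR": str((tmpdir or Path(tempfile.gettempdir())).resolve()),
    }
    for name, value in values.items():
        # A quote or backslash would break out of the SBPL string literal.
        if '"' in value or "\\" in value:
            raise ValueError(f"{name} contains characters unsafe for a policy file: {value!r}")
    return POLICY_TEMPLATE.substitute(values)
```

Resolving paths first is a deliberate choice in this sketch: on macOS, directories such as `/tmp` are symlinks, and a rule written against the unresolved path may not match.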

#### Linux: `bubblewrap` (bwrap)
Uses Linux namespaces for isolation:

```bash
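# ${PROJECT_DIR} and ${COMMAND} are set by the caller.
# --ro-bind-try skips source paths that do not exist (e.g. /lib64 on some distros).
# --proc and --dev give the shell a minimal /proc and /dev (including /dev/null).
# --unshare-net removes all network access; omit it when networking is allowed.
bwrap \
    --ro-bind /usr /usr \
    --ro-bind-try /bin /bin \
    --ro-bind-try /lib /lib \
    --ro-bind-try /lib64 /lib64 \
    --proc /proc \
    --dev /dev \
    --tmpfs /tmp \
    --bind "${PROJECT_DIR}" "${PROJECT_DIR}" \
    --unshare-net \
    --die-with-parent \
    -- bash -c "${COMMAND}"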
```
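The same invocation can be assembled programmatically. The sketch below builds the argument vector as a list, so the command string reaches `bash -c` as a single argument and is never re-parsed by a host shell. `MountPlan` and `build_bwrap_argv` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass


@dataclass
class MountPlan:
    """Hypothetical, simplified view of the filesystem/network settings."""
    read_only: list
    writable: list
    network: bool = False


def build_bwrap_argv(plan: MountPlan, command: str) -> list:
    argv = ["bwrap"]
    for path in plan.read_only:
        # --ro-bind-try ignores paths that do not exist on this system.
        argv += ["--ro-bind-try", path, path]
    # Mount the scratch filesystems before the writable binds so that a
    # writable path under /tmp is not hidden by the tmpfs.
    argv += ["--tmpfs", "/tmp", "--proc", "/proc", "--dev", "/dev"]
    for path in plan.writable:
        argv += ["--bind", path, path]
    if not plan.network:
        argv.append("--unshare-net")
    argv += ["--die-with-parent", "--", "bash", "-c", command]
    return argv
```

The resulting list can be passed directly to `asyncio.create_subprocess_exec(*argv, ...)`.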

#### Windows: Job Objects + Restricted Tokens
```python
# Both modules come from the pywin32 package.
# Use Windows Job Objects for resource limits
# Use restricted tokens for privilege reduction
import win32job       # Job Objects
import win32security  # Access tokens and privileges
```

### 3. Sandbox Configuration

```yaml
sandbox:
  enabled: true
  mode: "standard"           # off | permissive | standard | strict
  
  filesystem:
    # Paths to allow read access
    read:
      - "${PROJECT_DIR}"
      - "/usr"
      - "/bin"
      - "${HOME}/.config/amplifier"
      
    # Paths to allow write access
    write:
      - "${PROJECT_DIR}"
      - "${TMPDIR}"
      
    # Paths to explicitly deny (even if parent allowed)
    deny:
      - "${HOME}/.ssh"
      - "${HOME}/.gnupg"
      - "${HOME}/.aws"
      - "**/.env"
      - "**/.env.*"
      - "**/secrets.*"
      
  network:
    enabled: true            # Allow network access?
    allow:
      - "localhost:*"        # Local services
      - "*:80"               # HTTP
      - "*:443"              # HTTPS
    deny:
      - "169.254.169.254:*"  # AWS metadata
      - "metadata.google.*"  # GCP metadata
      
  resources:
    max_processes: 100       # Process limit
    max_memory_mb: 2048      # Memory limit
    max_cpu_seconds: 300     # CPU time limit
    max_file_size_mb: 100    # Max file write size
```
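The `deny` list is meant to win even when a parent directory is allowed. A minimal sketch of that precedence for read access, assuming already-expanded POSIX paths and `fnmatch`-style globs; `is_read_allowed` is a hypothetical helper, and a check like this would only complement the OS-level backend, not replace it:

```python
import fnmatch
import posixpath


def _matches(path: str, rule: str) -> bool:
    """Glob rules use fnmatch; plain rules match the path or anything below it."""
    if any(ch in rule for ch in "*?["):
        return fnmatch.fnmatchcase(path, rule)
    rule = rule.rstrip("/")
    return path == rule or path.startswith(rule + "/")


def is_read_allowed(path: str, read: list, deny: list) -> bool:
    # normpath collapses ".." lexically; it does not resolve symlinks.
    path = posixpath.normpath(path)
    if any(_matches(path, rule) for rule in deny):
        return False  # deny always overrides allow
    return any(_matches(path, rule) for rule in read)
```

For example, with `read=["/proj"]` and `deny=["**/.env"]`, `/proj/src/main.py` is readable while `/proj/.env` is not.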

### 4. Sandbox Profiles

Predefined profiles for common scenarios:

```yaml
sandbox:
  profile: "development"  # Uses predefined development profile
```

| Profile | Filesystem | Network | Resources |
|---------|------------|---------|-----------|
| `minimal` | Project only, read-only | None | Tight limits |
| `development` | Project R/W, system read | Full | Normal limits |
| `ci` | Project R/W | Limited | Normal limits |
| `untrusted` | Project read, temp write | None | Tight limits |
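
One way `profiles.py` could encode this table is a plain dictionary plus a resolver that applies user overrides on top. The key names (`fs_read`, `fs_write`, `network`, `limits`) and the symbolic values are assumptions made for this sketch:

```python
import copy

# Direct transcription of the profile table above into data.
PROFILES = {
    "minimal": {"fs_read": ["project"], "fs_write": [], "network": "none", "limits": "tight"},
    "development": {"fs_read": ["project", "system"], "fs_write": ["project"], "network": "full", "limits": "normal"},
    "ci": {"fs_read": ["project"], "fs_write": ["project"], "network": "limited", "limits": "normal"},
    "untrusted": {"fs_read": ["project"], "fs_write": ["temp"], "network": "none", "limits": "tight"},
}


def resolve_profile(name: str, overrides=None) -> dict:
    """Return a private copy of the profile with any overrides applied."""
    try:
        base = PROFILES[name]
    except KeyError:
        known = ", ".join(sorted(PROFILES))
        raise ValueError(f"unknown sandbox profile {name!r}; expected one of: {known}") from None
    resolved = copy.deepcopy(base)  # callers must not mutate the shared table
    resolved.update(overrides or {})
    return resolved
```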

### 5. New Module Structure

```
src/amplifier_app_cli/sandbox/
├── __init__.py
├── manager.py            # Main sandbox manager
├── config.py             # Configuration parsing
├── profiles.py           # Predefined profiles
├── backends/
│   ├── __init__.py
│   ├── base.py           # Abstract backend
│   ├── macos.py          # sandbox-exec backend
│   ├── linux.py          # bubblewrap backend
│   ├── windows.py        # Job Objects backend
│   └── fallback.py       # No-op fallback
└── policy.py             # Policy file generation
```

### 6. Core Interfaces

```python
import asyncio
import shutil
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Literal

@dataclass
class SandboxConfig:
    mode: Literal["off", "permissive", "standard", "strict"]
    filesystem_read: list[str]
    filesystem_write: list[str]
    filesystem_deny: list[str]
    network_enabled: bool
    network_allow: list[str]
    network_deny: list[str]
    max_processes: int
    max_memory_mb: int
    max_cpu_seconds: int
    max_file_size_mb: int  # Mirrors resources.max_file_size_mb in the YAML config

@dataclass
class SandboxResult:
    exit_code: int
    stdout: str
    stderr: str
    violations: list[str]  # Logged violations (permissive mode)
    resource_usage: dict[str, Any]

class SandboxBackend(ABC):
    """Abstract base for platform-specific sandboxing."""
    
    @abstractmethod
    def is_available(self) -> bool:
        """Check if this backend is available on current system."""
        ...
    
    @abstractmethod
    def create_policy(self, config: SandboxConfig) -> str:
        """Generate platform-specific policy."""
        ...
    
    @abstractmethod
    async def execute(
        self,
        command: str,
        config: SandboxConfig,
        cwd: Path | None = None,
        env: dict[str, str] | None = None
    ) -> SandboxResult:
        """Execute command in sandbox."""
        ...

class MacOSSandbox(SandboxBackend):
    """macOS sandbox-exec backend."""
    
    def is_available(self) -> bool:
        return sys.platform == "darwin" and shutil.which("sandbox-exec") is not None
    
    def create_policy(self, config: SandboxConfig) -> str:
        """Generate Seatbelt policy (.sb file)."""
        ...
    
    async def execute(
        self,
        command: str,
        config: SandboxConfig,
        cwd: Path | None = None,
        env: dict[str, str] | None = None
    ) -> SandboxResult:
        """Run command with sandbox-exec."""
        policy_file = self._write_policy(config)
        # Signature matches SandboxBackend.execute so SandboxManager can pass
        # cwd and env positionally.
        proc = await asyncio.create_subprocess_exec(
            "sandbox-exec", "-f", policy_file, "bash", "-c", command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            cwd=cwd,
            env=env,
        )
        ...

class LinuxSandbox(SandboxBackend):
    """Linux bubblewrap backend."""
    
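    def is_available(self) -> bool:
        # shutil.which returns a path or None, so compare explicitly to get a bool
        return sys.platform == "linux" and shutil.which("bwrap") is not None
    
    def create_policy(self, config: SandboxConfig) -> str:
        """Generate the bwrap argument list (required by the abstract base)."""
        ...
    
    async def execute(
        self,
        command: str,
        config: SandboxConfig,
        cwd: Path | None = None,
        env: dict[str, str] | None = None
    ) -> SandboxResult:
        """Run command with bwrap."""
        bwrap_args = self._build_bwrap_args(config)
        proc = await asyncio.create_subprocess_exec(
            "bwrap", *bwrap_args, "--", "bash", "-c", command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            cwd=cwd,
            env=env,
        )
        ...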

class SandboxManager:
    """Main interface for sandboxed execution."""
    
    def __init__(self, config: SandboxConfig):
        self.config = config
        self.backend = self._select_backend()
    
    def _select_backend(self) -> SandboxBackend:
        """Select best available backend for current platform."""
        backends = [MacOSSandbox(), LinuxSandbox(), WindowsSandbox()]
        for backend in backends:
            if backend.is_available():
                return backend
        return FallbackSandbox()  # No-op, just logs warning
    
    async def execute(
        self,
        command: str,
        cwd: Path | None = None,
        env: dict[str, str] | None = None
    ) -> SandboxResult:
        """Execute command in sandbox."""
        if self.config.mode == "off":
            return await self._execute_unsandboxed(command, cwd, env)
        return await self.backend.execute(command, self.config, cwd, env)
```
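
The interfaces above reference `_execute_unsandboxed` and `FallbackSandbox` without defining them. A minimal sketch of what the unsandboxed path might look like follows; `run_unsandboxed`, `PlainResult`, and the `timeout` and `shell` parameters are assumptions for illustration, and the real implementation would return `SandboxResult`:

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlainResult:
    """Stand-in for SandboxResult so the sketch is self-contained."""
    exit_code: int
    stdout: str
    stderr: str
    violations: list = field(default_factory=list)  # always empty: nothing is enforced


async def run_unsandboxed(
    command: str,
    cwd: Optional[str] = None,
    env: Optional[dict] = None,
    timeout: Optional[float] = None,
    shell: str = "bash",
) -> PlainResult:
    proc = await asyncio.create_subprocess_exec(
        shell, "-c", command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        cwd=cwd,
        env=env,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()        # do not leave the child running after a timeout
        await proc.wait()
        raise
    return PlainResult(
        exit_code=proc.returncode,
        stdout=out.decode(errors="replace"),
        stderr=err.decode(errors="replace"),
    )
```

A fallback backend could wrap this call and log a warning that no isolation is in effect.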

### 7. Integration with Bash Tool

Modify the existing Bash tool to route every command through the sandbox manager:

```python
class BashTool:
    def __init__(self, sandbox_manager: SandboxManager):
        self.sandbox_manager = sandbox_manager
    
    async def execute(self, command: str) -> ToolResult:
        result = await self.sandbox_manager.execute(command)
        
        if result.violations:
            # Log violations (permissive mode)
            logger.warning(f"Sandbox violations: {result.violations}")
        
        return ToolResult(
            output=result.stdout,
            error=result.stderr if result.exit_code != 0 else None,
            exit_code=result.exit_code
        )
```

### 8. Violation Handling

```yaml
sandbox:
  on_violation: "block"    # block | warn | log
```

| Action | Behavior |
|--------|----------|
| `block` | Deny operation, return error |
| `warn` | Allow but show warning to user |
| `log` | Allow silently, log for audit |
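
The three actions could be dispatched by a single function that reports whether the operation may proceed. This is a sketch only; `handle_violation` and the logger name are hypothetical:

```python
import logging
import sys

logger = logging.getLogger("amplifier.sandbox")  # logger name is an assumption


def handle_violation(action: str, description: str) -> bool:
    """Apply the on_violation setting; return True if the operation may proceed."""
    if action == "block":
        logger.error("Sandbox blocked: %s", description)
        return False
    if action == "warn":
        # Shown to the user, but the operation continues.
        print(f"warning: sandbox violation allowed: {description}", file=sys.stderr)
        return True
    if action == "log":
        logger.info("Sandbox violation (allowed): %s", description)
        return True
    raise ValueError(f"unknown on_violation action: {action!r}")
```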

### 9. CLI Flags

```bash
# Override sandbox mode
amplifier run -p "task" --sandbox strict
amplifier run -p "task" --sandbox off

# Use sandbox profile
amplifier run -p "task" --sandbox-profile development

# Quick disable (dangerous)
amplifier run -p "task" --no-sandbox
```
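
A sketch of how these flags might be parsed and reduced to one effective mode. It uses `argparse` only to stay self-contained; the CLI framework Amplifier actually uses is not stated here, and `build_parser` and `effective_mode` are hypothetical names:

```python
import argparse

MODES = ("off", "permissive", "standard", "strict")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="amplifier run", allow_abbrev=False)
    parser.add_argument("-p", dest="prompt", required=True)
    # --sandbox and --no-sandbox contradict each other, so reject using both.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--sandbox", choices=MODES, default=None)
    group.add_argument("--no-sandbox", action="store_true")
    parser.add_argument("--sandbox-profile", default=None)
    return parser


def effective_mode(args: argparse.Namespace, configured: str = "standard") -> str:
    """CLI flags override the configured mode; --no-sandbox is shorthand for off."""
    if args.no_sandbox:
        return "off"
    return args.sandbox or configured
```

Making the two flags mutually exclusive is a design choice of this sketch; the proposal itself does not say how a conflicting combination should be handled.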

### 10. Slash Commands

```
/sandbox                 # Show current sandbox status
/sandbox status          # Same as above
/sandbox test <cmd>      # Test command against sandbox policy
/sandbox violations      # Show recent violations
```
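
Parsing these commands is mostly a matter of splitting off the subcommand and keeping the remainder intact for `test`. A small sketch, with `parse_sandbox_command` as a hypothetical helper:

```python
SUBCOMMANDS = {"status", "test", "violations"}


def parse_sandbox_command(line: str):
    """Return (subcommand, argument); a bare /sandbox means status."""
    parts = line.strip().split(maxsplit=2)
    if not parts or parts[0] != "/sandbox":
        raise ValueError("not a /sandbox command")
    if len(parts) == 1:
        return ("status", None)
    sub = parts[1]
    if sub not in SUBCOMMANDS:
        raise ValueError(f"unknown /sandbox subcommand: {sub!r}")
    arg = parts[2] if len(parts) > 2 else None
    if sub == "test" and arg is None:
        raise ValueError("usage: /sandbox test <cmd>")
    if sub != "test" and arg is not None:
        raise ValueError(f"/sandbox {sub} takes no arguments")
    # For "test", arg is the full command string, spaces included.
    return (sub, arg)
```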

### 11. Audit Logging

```yaml
sandbox:
  audit:
    enabled: true
    log_file: "~/.amplifier/sandbox-audit.log"
    log_allowed: false     # Log allowed operations too?
    log_denied: true       # Log denied operations
```

Audit log format:
```json
{
    "timestamp": "2024-01-15T10:30:00Z",
    "session_id": "abc123",
    "command": "cat ~/.ssh/id_rsa",
    "operation": "file-read",
    "path": "/Users/user/.ssh/id_rsa",
    "action": "denied",
    "rule": "filesystem.deny: ${HOME}/.ssh"
}
```
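
A sketch of producing such a record, assuming the log is written as one JSON object per line (the proposal does not specify the on-disk layout). `audit_line` and `should_log` are hypothetical helpers; `should_log` mirrors the `log_allowed` and `log_denied` switches:

```python
import json
from datetime import datetime, timezone
from typing import Optional


def should_log(action: str, log_allowed: bool = False, log_denied: bool = True) -> bool:
    """Decide whether an event is recorded, based on the audit settings."""
    return log_denied if action == "denied" else log_allowed


def audit_line(
    session_id: str,
    command: str,
    operation: str,
    path: str,
    action: str,
    rule: str,
    now: Optional[datetime] = None,
) -> str:
    # Pass a timezone-aware datetime; timestamps are normalised to UTC.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "session_id": session_id,
        "command": command,
        "operation": operation,
        "path": path,
        "action": action,
        "rule": rule,
    }
    # json.dumps escapes newlines in values, so each record stays on one line.
    return json.dumps(record)
```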

## Acceptance Criteria

- [ ] macOS sandbox-exec backend working
- [ ] Linux bubblewrap backend working
- [ ] Windows fallback (Job Objects or warning)
- [ ] Sandbox modes: off, permissive, standard, strict
- [ ] Filesystem allow/deny rules work
- [ ] Network allow/deny rules work
- [ ] Resource limits enforced
- [ ] Predefined profiles available
- [ ] `--sandbox` CLI flag works
- [ ] `/sandbox` slash commands work
- [ ] Violations logged in permissive mode
- [ ] Audit logging for security review
- [ ] Graceful fallback on unsupported platforms
- [ ] Unit tests for policy generation
- [ ] Integration tests for sandboxed execution

## Related
- Depends on: None (but enhances Permission System)
- Enhances: Security for untrusted prompts

## Estimated Effort
**High** - 3-4 weeks (platform-specific work)

## Files to Create/Modify
- `src/amplifier_app_cli/sandbox/` (new module)
- `src/amplifier_app_cli/lib/tools.py` (integrate with Bash tool)
- `src/amplifier_app_cli/commands/run.py` (--sandbox flag)
- `src/amplifier_app_cli/commands/slash.py` (/sandbox commands)

