SSHX Documentation

sshx is a cross-platform SSH and SFTP command-line client for people and automation agents who operate many remote servers. It keeps the tool shape simple: one command opens one SSH session, does the requested work, writes an optional local audit event, and exits.

The documentation starts in English by default. Use the language switch in the top navigation bar to open the matching Chinese page.

What SSHX Is Good At

  • Run a remote command with predictable stdout, stderr, and exit-code behavior.
  • Save sudo passwords in the operating system keyring instead of plaintext files.
  • Use short host names from ~/.sshx/settings.json instead of repeating IP, port, user, and key paths.
  • Perform small SFTP tasks without opening an interactive client.
  • Produce JSON output that scripts and AI agents can branch on.
  • Preview local execution plans with --dry-run before connecting, reading secrets, mutating known_hosts, or writing host config.
  • Keep a local JSONL audit trail without recording plaintext passwords, private keys, stdout, or stderr.

Mental Model

Think of sshx as a safer one-shot remote operation helper, not a shell replacement and not a remote orchestration platform.

human, script, or agent
        |
        v
sshx CLI flags and optional .env
        |
        v
named host resolution and safety checks
        |
        v
SSH command or SFTP action
        |
        v
structured result, exit code, optional audit event

Common First Commands

# See available flags and examples
sshx --help

# Run a simple command
sshx -h=192.168.1.100 -u=root "uptime"

# Run against a named host
sshx -h=prod-web "systemctl is-active nginx"

# Preview what would happen before connecting
sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"

# Get machine-readable output for automation
sshx -h=prod-web --json "systemctl is-active nginx"

Safety First

Remote access tools can cause real damage. The safe default path in sshx is strict:

  • Host keys are checked through known_hosts.
  • Passwords belong in the OS keyring, not in shell history or config files.
  • Sudo passwords are sent through stdin, never interpolated into the command string.
  • Obvious destructive commands are blocked unless the user explicitly bypasses checks.
  • Safety checks are guardrails against mistakes; they are not a sandbox for untrusted commands.

Read Security Guidelines before using sshx in production or agent-driven workflows.

Getting Started

This guide walks through a first safe setup. The examples use prod-web as the host name; replace it with your own server name or IP address.

Install

If Go is already installed:

go install github.com/talkincode/sshx/cmd/sshx@latest
sshx --version

You can also run sshx without installing it. Replace @latest with a version tag to pin a specific release:

go run github.com/talkincode/sshx/cmd/sshx@latest --help

Verify SSH Trust First

sshx checks host keys by default. Add the server to known_hosts before the first connection:

ssh-keyscan -H prod-web >> ~/.ssh/known_hosts

If you deliberately want first-use trust, use:

sshx --accept-unknown-host -h=prod-web "uptime"

Avoid --insecure-hostkey except in short-lived controlled labs. It disables the trust check that protects against man-in-the-middle attacks.

Run The First Command

sshx -h=prod-web -u=deploy "uptime"

Useful variations:

# Non-standard SSH port
sshx -h=prod-web -p=2222 -u=deploy "uptime"

# Specific SSH key
sshx -h=prod-web -u=deploy -i=~/.ssh/prod-web.pem "uptime"

# Bound a slow command
sshx -h=prod-web --timeout=30s "apt-get update"

Add A Named Host

Named hosts keep connection details in one local file:

sshx --host-add --host-name=prod-web -h=192.168.1.100 -u=deploy -i=~/.ssh/prod-web.pem --host-desc="Production web node"

After that:

sshx --host-list
sshx --host-test=prod-web
sshx -h=prod-web "uname -a"

The settings file is ~/.sshx/settings.json and is written with 0600 permissions.
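
A quick local check of that permission can be sketched as follows. This is a minimal sketch and not part of sshx: the helper name is_private_file is hypothetical, and it relies only on the POSIX find utility.

```shell
# is_private_file PATH
# Succeeds only when PATH exists and its mode is exactly 0600.
# Hypothetical helper for runbook scripts; sshx does not ship it.
is_private_file() {
  # find prints the path only when the permission bits match 600 exactly.
  [ -n "$(find "$1" -prune -perm 600 -print 2>/dev/null)" ]
}
```

Calling is_private_file ~/.sshx/settings.json at the top of a runbook script would flag a settings file whose mode has drifted.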

Save A Sudo Password

For commands that start with sudo, sshx can read a password from the OS keyring and feed it to sudo through stdin.

sshx --password-set=prod-web-sudo
sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl status nginx"

Use interactive input. Avoid inline values such as --password-set=key:password; they can leak through shell history or process lists.

Preview Before Running

--dry-run explains how sshx interpreted the command without connecting, executing, reading keyring secrets, changing known_hosts, or writing host config.

sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"

Dry-run proves the local plan. It does not prove the remote command would succeed.

Host Management

Named hosts turn repetitive SSH details into a short, readable alias. They are useful when you manage several servers, when each server uses a different SSH key, or when an agent needs a stable host inventory.

Add Hosts

Interactive setup:

sshx --host-add

Command-line setup:

sshx --host-add \
  --host-name=prod-web \
  -h=192.168.1.100 \
  -p=22 \
  -u=deploy \
  -i=~/.ssh/prod-web.pem \
  -pk=prod-web-sudo \
  --host-desc="Production web node" \
  --host-type=linux

Then run commands by alias:

sshx -h=prod-web "hostname && uptime"

Settings File

Host definitions live in ~/.sshx/settings.json.

{
  "key": "/Users/alice/.ssh/id_rsa",
  "hosts": [
    {
      "name": "prod-web",
      "description": "Production web node",
      "host": "192.168.1.100",
      "port": "22",
      "user": "deploy",
      "key": "/Users/alice/.ssh/prod-web.pem",
      "password_key": "prod-web-sudo",
      "type": "linux"
    }
  ]
}

The top-level key is the default SSH private key. A per-host key overrides it for that host only.
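
The override rule amounts to a simple fallback. The sketch below only illustrates that rule; effective_key is a hypothetical helper, and how sshx resolves keys internally is not shown in this documentation.

```shell
# effective_key HOST_KEY DEFAULT_KEY
# Prints the per-host key when it is set, otherwise the top-level default.
# Illustrative only; this mirrors the documented rule, not sshx source code.
effective_key() {
  printf '%s\n' "${1:-$2}"
}
```

With the settings shown above, prod-web would resolve to /Users/alice/.ssh/prod-web.pem, while a host entry without its own key would fall back to /Users/alice/.ssh/id_rsa.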

Daily Host Commands

# List configured hosts
sshx --host-list

# Test one host
sshx --host-test=prod-web

# Test every host with a per-host dial timeout
sshx --host-test-all

# Update a host
sshx --host-update --host-name=prod-web -u=deploy -i=~/.ssh/prod-web-2026.pem

# Remove a host
sshx --host-remove=old-lab

Practical Naming Patterns

Use names that explain both role and environment:

prod-web-1
prod-db-primary
staging-api
lab-router
customer-a-jump

Choose password key names that do not reveal sensitive topology if they appear in public logs. For shared runbooks, prefer placeholders:

sshx -h=prod-web -pk=<sudo-key> "sudo systemctl reload nginx"

Team And Agent Use

For human operators, named hosts reduce typing errors. For automation agents, they create a stable boundary:

  • The agent receives prod-web, not a raw IP and key path.
  • The operator can review ~/.sshx/settings.json.
  • --dry-run --json can confirm which address, port, user, key, and sudo key would be used.
  • Audit events can record the resolved host without storing secrets.

SFTP Workflows

sshx supports one-shot SFTP actions for common file tasks. It is not an interactive file manager; each invocation performs one clear upload, download, list, mkdir, or remove operation.

Upload A File

sshx -h=prod-web --upload=./deploy/nginx.conf --to=/tmp/nginx.conf

Safe production pattern:

# Upload to a temporary path first
sshx -h=prod-web --upload=./deploy/nginx.conf --to=/tmp/nginx.conf

# Install it into place with an explicit mode, then validate and reload
sshx -h=prod-web "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf"
sshx -h=prod-web "sudo nginx -t"
sshx -h=prod-web "sudo systemctl reload nginx"

Download A File

sshx -h=prod-web --download=/var/log/nginx/error.log --to=./error.log

Incident collection example:

mkdir -p incident-2026-07-01/prod-web
sshx -h=prod-web --download=/var/log/nginx/error.log --to=incident-2026-07-01/prod-web/error.log
sshx -h=prod-web --download=/etc/os-release --to=incident-2026-07-01/prod-web/os-release

List And Create Directories

sshx -h=prod-web --list=/var/log
sshx -h=prod-web --mkdir=/tmp/sshx-upload

Remove A Remote File

sshx -h=prod-web --rm=/tmp/old-upload.txt

Treat remote deletion as a production change. Prefer listing the parent directory first:

sshx -h=prod-web --list=/tmp
sshx -h=prod-web --rm=/tmp/old-upload.txt

Path Boundary

Local paths use your local operating system rules. Remote paths are SFTP paths and should be written as slash-separated remote paths, even when sshx is run from Windows.

# Local Windows path, remote POSIX path
sshx -h=prod-web --upload=C:\Users\alice\release.zip --to=/tmp/release.zip
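
A script that builds remote targets can catch the most common mix-up, a local Windows path passed as the remote side, before calling sshx. The check below is a heuristic sketch with a hypothetical name, looks_like_remote_path. It is stricter than POSIX itself, which technically allows backslashes and colons in file names.

```shell
# looks_like_remote_path PATH
# Heuristic: reject values that look like Windows paths.
# Hypothetical helper; not an sshx feature.
looks_like_remote_path() {
  case "$1" in
    *\\*)       return 1 ;;  # contains a backslash
    [A-Za-z]:*) return 1 ;;  # starts with a drive letter such as C:
    *)          return 0 ;;
  esac
}
```

A wrapper could refuse to run when looks_like_remote_path fails for the value it is about to pass to --to.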

When To Use Plain SSH Instead

Use an SSH command when the operation needs remote validation or privilege changes:

sshx -h=prod-web "sudo ls -l /etc/nginx"
sshx -h=prod-web "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf"

Use SFTP for direct file movement. Use remote commands for checks, ownership changes, service reloads, and cleanup that requires sudo.

Agent And Script Mode

sshx is designed to be called by scripts and AI agents. The contract is intentionally simple: predictable streams, predictable exit codes, optional JSON, and optional local audit events.

Default Stream Behavior

By default sshx does not request a PTY. That keeps stdout and stderr separate and avoids terminal control characters in script output.

sshx -h=prod-web "systemctl is-active nginx"

The remote command exit code becomes the sshx process exit code when the remote command runs.

Exit Codes

Code     Meaning
0        Remote command succeeded.
1..254   Remote command failed with that exit code.
255      sshx failed before or around execution, such as connect, auth, host-key, timeout, blocked command, config, or other local error.

In JSON mode, sshx-level failures use exit_code: -1 and a non-empty error_kind, so automation can distinguish them from a remote command that exits 255.
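
For plain (non-JSON) invocations, the exit-code contract can be folded into a small helper. This is a minimal sketch; classify_exit and its three labels are hypothetical names chosen for illustration.

```shell
# classify_exit CODE
# Maps an sshx process exit code to the categories described above.
# Hypothetical helper; the labels are not produced by sshx itself.
classify_exit() {
  case "$1" in
    0)   echo "ok" ;;
    255) echo "sshx-or-255" ;;   # sshx-level failure, or a remote command that itself exited 255
    *)   echo "remote-failed" ;;
  esac
}
```

A script would call classify_exit "$?" right after the sshx command. The 255 case stays ambiguous without JSON, which is why the JSON fields exit_code and error_kind are the more reliable signal for automation.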

JSON Output

sshx -h=prod-web --json "systemctl is-active nginx"

Example shape:

{
  "host": "192.168.1.100",
  "port": "22",
  "user": "deploy",
  "command": "systemctl is-active nginx",
  "exit_code": 0,
  "success": true,
  "stdout": "active\n",
  "stderr": "",
  "duration_ms": 142,
  "auth_method": "key"
}

Agent branching example:

result="$(sshx -h=prod-web --json "systemctl is-active nginx")"
if printf '%s' "$result" | jq -e '.success == true' >/dev/null; then
  echo "nginx is active"
else
  printf '%s\n' "$result" | jq '{exit_code, error_kind, stderr}'
fi

Dry-Run For Change Review

Before a script performs a privileged operation, ask for the plan:

sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"

Use dry-run to verify host resolution, selected sudo key, safety status, and whether the command would mutate state. Do not treat it as proof that the remote service can restart successfully.

Timeouts

Always set timeouts for unattended workflows:

sshx -h=prod-web --timeout=30s --json "systemctl is-active nginx"
sshx -h=prod-web --timeout=2m --json "sudo apt-get update"

Audit Events

Non-dry-run invocations write local JSONL audit events by default:

~/.sshx/audit/sshx-YYYY-MM-DD.jsonl

Store audit events next to a project or incident directory:

sshx -h=prod-web --audit-output=./.sshx-audit "systemctl reload nginx"

Audit events are for provenance. They record metadata and outcomes, but they do not record plaintext passwords, private key contents, stdout, or stderr.
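
To locate the default audit file for a given day, the documented file name pattern can be wrapped in a helper. This is a sketch under assumptions: audit_file is a hypothetical name, and the documentation does not say whether sshx uses the local or the UTC date, so the default of today's local date may be off around midnight.

```shell
# audit_file [YYYY-MM-DD]
# Prints the default audit file path for the given day (today if omitted).
# Hypothetical helper built from the documented path pattern.
audit_file() {
  day="${1:-$(date +%Y-%m-%d)}"
  printf '%s/.sshx/audit/sshx-%s.jsonl\n' "$HOME" "$day"
}
```

For example, tail -n 5 "$(audit_file)" would show the most recent events, one JSON object per line.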

PTY Is Explicit

Some commands need terminal behavior:

sshx -h=prod-web --pty "top -b -n1"

Do not combine --pty with --json. A PTY merges stderr into stdout and makes structured automation less reliable.

Usage Scenarios

This page is intentionally example-heavy. Treat the host names as placeholders and adapt the commands to your own runbooks.

Scenario 1: First Health Check

You just received access to a server and want a low-risk check.

ssh-keyscan -H prod-web >> ~/.ssh/known_hosts
sshx -h=prod-web -u=deploy "hostname && uptime && whoami"

Why this is useful: it verifies host trust, authentication, the remote user, and basic reachability without changing the server.

Scenario 2: Add A Production Host Once

sshx --host-add \
  --host-name=prod-web \
  -h=192.168.1.100 \
  -u=deploy \
  -i=~/.ssh/prod-web.pem \
  -pk=prod-web-sudo \
  --host-desc="Production web node"

sshx --host-test=prod-web
sshx -h=prod-web "hostname"

Why this is useful: future commands no longer repeat IP, user, key path, and sudo key.

Scenario 3: Check A Service Without Changing It

sshx -h=prod-web "systemctl is-active nginx"
sshx -h=prod-web "systemctl status nginx --no-pager"

For automation:

sshx -h=prod-web --json "systemctl is-active nginx"

Scenario 4: Restart A Service With Review

sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"
sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl restart nginx"
sshx -h=prod-web "systemctl is-active nginx"

Why this is useful: the dry-run confirms local interpretation before a privileged change.

Scenario 5: Check Disk Pressure On Several Servers

for host in prod-web prod-api prod-db; do
  echo "== $host =="
  sshx -h="$host" --timeout=15s "df -h / /var /data"
done

Agent-friendly version:

for host in prod-web prod-api prod-db; do
  sshx -h="$host" --timeout=15s --json "df -h / /var /data"
done
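
When the loop runs unattended, it helps to keep going after a failure and still report a failing status at the end. The sketch below assumes only the documented exit-code behavior. The SSHX variable is a convenience of this script (it lets you substitute a stand-in binary when trying the function out), and check_hosts is a hypothetical name.

```shell
# Binary to invoke; override only to test the function without a real server.
SSHX="${SSHX:-sshx}"

# check_hosts HOST...
# Runs the same bounded check on every host, reports failures on stderr,
# and succeeds only if every host succeeded.
check_hosts() {
  failed=0
  for host in "$@"; do
    if ! "$SSHX" -h="$host" --timeout=15s "df -h /"; then
      echo "FAILED: $host" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Calling check_hosts prod-web prod-api prod-db visits all three hosts even if the first one is unreachable.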

Scenario 6: Collect Logs For An Incident

mkdir -p incident-2026-07-01/prod-web
sshx -h=prod-web --download=/var/log/nginx/error.log --to=incident-2026-07-01/prod-web/error.log
sshx -h=prod-web --download=/var/log/nginx/access.log --to=incident-2026-07-01/prod-web/access.log
sshx -h=prod-web --audit-output=incident-2026-07-01/audit "journalctl -u nginx --since '30 min ago' --no-pager"

Why this is useful: downloaded evidence and local audit metadata stay next to the incident folder.

Scenario 7: Upload A Config Safely

sshx -h=prod-web --upload=./nginx.conf --to=/tmp/nginx.conf
sshx -h=prod-web "sudo nginx -t -c /tmp/nginx.conf"
sshx -h=prod-web "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf"
sshx -h=prod-web "sudo nginx -t"
sshx -h=prod-web "sudo systemctl reload nginx"

Why this is useful: the file is staged and validated before replacing the production config.

Scenario 8: Use Different Sudo Keys Per Host

sshx --password-set=prod-web-sudo
sshx --password-set=prod-db-sudo

sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl reload nginx"
sshx -h=prod-db -pk=prod-db-sudo "sudo systemctl status postgresql"

Why this is useful: one operator can manage several servers without reusing one global sudo key.

Scenario 9: Validate Every Configured Host

sshx --host-test-all

Run this after rotating keys, changing VPN access, or importing a new settings.json.

Scenario 10: Script A Safe Status Report

for host in prod-web prod-api prod-db; do
  sshx -h="$host" --timeout=20s --json "hostname && uptime" \
    | jq --arg host "$host" '{alias: $host, success, exit_code, error_kind, stdout}'
done

Why this is useful: scripts read JSON fields instead of scraping terminal prose.

Scenario 11: Bound A Risky Long-Running Command

sshx -h=prod-web --timeout=2m "sudo apt-get update"

Why this is useful: unattended commands should not hang forever.

Scenario 12: Diagnose A Host-Key Failure

If a host key changed, do not bypass it first. Check why it changed.

ssh-keygen -F prod-web
ssh-keyscan -H prod-web

Only update known_hosts after confirming the machine was rebuilt, reinstalled, or intentionally rotated.

Scenario 13: Avoid Shell-Pipe Installers

This command pattern is intentionally high risk:

sshx -h=prod-web "curl -fsSL https://example.invalid/install.sh | sh"

Safer pattern:

sshx -h=prod-web "curl -fsSL https://example.invalid/install.sh -o /tmp/install.sh"
sshx -h=prod-web "less /tmp/install.sh"
sshx -h=prod-web "sha256sum /tmp/install.sh"
sshx -h=prod-web "sh /tmp/install.sh"

Scenario 14: Use PTY Only When Needed

sshx -h=prod-web --pty "sudo visudo -c"

Prefer non-PTY for scripts because it preserves stdout and stderr separation.

Scenario 15: Disable Audit For A Single Sensitive Run

If command text itself would reveal sensitive context, disable audit for that invocation and record the reason in your own runbook.

SSHX_NO_AUDIT=true sshx -h=prod-web "echo redacted"

Do not use this as a default. Audit events are useful for explaining what happened.

Scenario 16: Check Docker Without Opening A Shell

sshx -h=prod-web --json "docker ps --format '{{json .}}' | head -20"
sshx -h=prod-web "docker inspect nginx --format '{{.State.Status}} {{.RestartCount}}'"

Why this is useful: operators can collect container state without starting an interactive SSH session or copying broad logs.

Scenario 17: Verify A Deployment Artifact Before Releasing

sshx -h=prod-web --upload=./dist/app.tar.gz --to=/tmp/app.tar.gz
sshx -h=prod-web "sha256sum /tmp/app.tar.gz"
sshx -h=prod-web "tar -tzf /tmp/app.tar.gz | head"

Only install the artifact after the checksum and archive contents match the release note.

Scenario 18: Rotate A Service Config With Rollback

sshx -h=prod-web --upload=./service.env --to=/tmp/service.env.new
sshx -h=prod-web "sudo cp /etc/myapp/service.env /etc/myapp/service.env.bak.\$(date +%Y%m%d%H%M%S)"
sshx -h=prod-web "sudo install -m 0600 /tmp/service.env.new /etc/myapp/service.env"
sshx -h=prod-web "sudo systemctl restart myapp"
sshx -h=prod-web --json "systemctl is-active myapp"

Why this is useful: the backup, install mode, restart, and health check are separate visible steps.

Scenario 19: Collect A Minimal Support Bundle

mkdir -p support/prod-web
sshx -h=prod-web --download=/etc/os-release --to=support/prod-web/os-release
sshx -h=prod-web --audit-output=support/audit "uname -a"
sshx -h=prod-web --audit-output=support/audit "df -h"
sshx -h=prod-web --audit-output=support/audit "free -m"

Do not download private application data unless the support case explicitly needs it.

Scenario 20: Use -- When Remote Flags Look Like Local Flags

sshx -h=prod-web -- docker run --rm alpine:3.20 sh -c 'echo hello'
sshx -h=prod-web -- echo --force belongs-to-the-remote-command

Why this is useful: -- makes the boundary between local sshx flags and remote command arguments obvious.

Scenario 21: Test A New Host Entry Before Sharing It

sshx --host-add --host-name=staging-api -h=10.0.8.21 -u=deploy -i=~/.ssh/staging.pem -pk=staging-api-sudo
sshx --host-test=staging-api
sshx -h=staging-api --dry-run --json "sudo systemctl reload api"

Only commit or share a runbook after the named host resolves, authenticates, and selects the expected sudo key.

Scenario 22: Keep A Migration Run Bounded

sshx -h=prod-db --timeout=10s --json "pg_isready"
sshx -h=prod-db --timeout=5m --dry-run --json "sudo systemctl restart postgresql"
sshx -h=prod-db --timeout=5m -pk=prod-db-sudo "sudo systemctl restart postgresql"
sshx -h=prod-db --timeout=30s --json "pg_isready"

Why this is useful: every step has a time budget and a machine-readable result.

Scenario 23: Remove A Temporary File With Evidence

sshx -h=prod-web --list=/tmp
sshx -h=prod-web --rm=/tmp/app.tar.gz
sshx -h=prod-web --list=/tmp

Deletion should be visible before and after. For high-risk paths, prefer a remote mv into a dated quarantine directory before permanent removal.

Scenario 24: Fail Closed In CI

result="$(sshx -h=prod-web --timeout=20s --json "systemctl is-active nginx")"
printf '%s\n' "$result" | jq .
printf '%s\n' "$result" | jq -e '.success == true and .stdout == "active\n"'

Why this is useful: CI fails when the structured result is missing, the command fails, or the service state is not exactly what the runbook expects.

Security Guidelines

Remote execution is high impact. These rules are strict because a small mistake can change production systems, leak credentials, or hide the real cause of an incident.

Non-Negotiable Rules

  1. Keep host-key verification strict.
  2. Store passwords in the OS keyring, not in files, shell history, tickets, or chat.
  3. Send sudo passwords through stdin only; never place them in command strings.
  4. Treat --force, --no-safety-check, and --insecure-hostkey as exceptional break-glass choices.
  5. Use --dry-run before privileged or destructive operations.
  6. Use --json and explicit exit-code checks for automation.
  7. Remember that command safety checks are not a sandbox.

Production Policy

For production, shared runbooks, CI jobs, and agent-driven operations, treat these as policy instead of suggestions:

  • Use named hosts so reviewers can see the intended target.
  • Set --timeout on every unattended command.
  • Use --audit-output for project, migration, release, and incident work.
  • Require --dry-run --json before a privileged mutation.
  • Keep --force and --no-safety-check out of reusable scripts.
  • Keep --insecure-hostkey out of reusable scripts and CI.
  • Do not run commands copied from chat, tickets, or web pages until they are reviewed against the target host and rollback plan.
  • Prefer staged file writes: upload to /tmp, validate, then install with explicit mode and ownership.
  • Prefer one visible step per irreversible action; avoid chaining many privileged changes with &&.
  • Record the maintenance window, operator, command, result, and rollback decision in your own runbook when the action affects production.

Never make a weak security flag global through shell profiles, CI variables, or shared .env files. A break-glass override must be local to one command and easy to remove.
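
A plain text search is often enough to catch overrides that have crept into shared files. The helper below is a rough sketch with a hypothetical name, find_break_glass. It matches text only, so it will also flag a remote command that happens to contain --force, and each hit still needs human review.

```shell
# find_break_glass FILE...
# Prints every line that mentions a break-glass override, with line numbers.
# Exit status follows grep: 0 when something was found, 1 when nothing was.
find_break_glass() {
  grep -nE -e '--(force|no-safety-check|insecure-hostkey)|SSH_INSECURE_HOST_KEY' "$@"
}
```

Typical targets would be shell profiles, CI definitions, and shared .env files.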

Host-Key Trust

Default behavior protects against unknown or changed host keys. Use one of these safe paths:

# Recommended: explicitly add the host key after reviewing the target
ssh-keyscan -H prod-web >> ~/.ssh/known_hosts

# Accept first use for a known controlled host
sshx --accept-unknown-host -h=prod-web "uptime"

Avoid:

sshx --insecure-hostkey -h=prod-web "uptime"

Use insecure host-key mode only in short-lived controlled labs where the risk is understood and recorded. Never make it the default in scripts or shared runbooks.

Secret Handling

Use interactive keyring storage:

sshx --password-set=prod-web-sudo

Avoid inline secrets:

sshx --password-set=prod-web-sudo:plain-text-password

Inline values can leak through shell history, terminal scrollback, process listings, logs, or copied commands.

Keyring password keys are for sudo auto-fill. SSH_PASSWORD is an SSH login password and should be treated as a high-risk fallback, not a normal operating mode.

Sudo Rules

sshx auto-fills sudo only when the remote command starts with sudo:

sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl reload nginx"

These commands do not trigger sudo auto-fill:

sshx -h=prod-web "sh -c 'sudo whoami'"
sshx -h=prod-web "echo sudo"

This boundary keeps password lookup, stdin injection, and audit metadata aligned to one clear rule.
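
A script can apply the same rule before deciding whether to pass -pk at all. The check below approximates the documented behavior. How sshx treats edge cases such as leading whitespace is not specified here, so starts_with_sudo (a hypothetical name) is deliberately strict.

```shell
# starts_with_sudo COMMAND
# Succeeds when the command string begins with the word "sudo".
# Approximation of the documented rule; not taken from sshx source code.
starts_with_sudo() {
  case "$1" in
    sudo|"sudo "*) return 0 ;;
    *)             return 1 ;;
  esac
}
```

With this check, "sudo systemctl reload nginx" qualifies, while "sh -c 'sudo whoami'" and "echo sudo" do not, matching the examples above.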

Safety Checks Are Guardrails

sshx blocks common destructive patterns such as root deletion, disk formatting, shutdown or reboot commands, critical system file edits, fork bombs, and curl | sh style pipelines.

That does not make untrusted commands safe. A command validator cannot understand every script, shell expansion, application-specific migration, or data-destruction path.

Before bypassing checks:

sshx -h=prod-web --dry-run --json "sudo systemctl reboot"
sshx -h=prod-web --force "sudo systemctl reboot"

Ask:

  • Is the target host correct?
  • Is the command reviewed?
  • Is there a maintenance window?
  • Is rollback possible?
  • Is the bypass reason recorded?

If any answer is “no”, stop and fix the runbook first. --force should mean “I reviewed this exact command for this exact target”, not “make the tool stop complaining”.

Agent And Automation Rules

Automation should be more conservative than a human terminal:

  • Always set --timeout.
  • Prefer --json.
  • Parse success, exit_code, and error_kind.
  • Run --dry-run --json before privileged changes.
  • Do not set SSH_INSECURE_HOST_KEY=1 globally.
  • Do not pass plaintext passwords through environment variables unless there is no safer path and the lifetime is tightly controlled.
  • Store audit events with --audit-output when a run belongs to a project, migration, or incident.

Audit Trail Boundaries

Audit events are local JSONL records for provenance. They record metadata such as mode, action, host resolution, sudo/keyring decisions, safety status, authentication method, exit code, error kind, and duration.

They intentionally do not record:

  • Plaintext passwords.
  • Private key contents.
  • stdout.
  • stderr.

Command text is included for provenance and redacted for common password or token-style arguments, but do not treat redaction as a reason to place secrets in commands.

SFTP Safety

For uploads to privileged paths, stage the file first:

sshx -h=prod-web --upload=./service.conf --to=/tmp/service.conf
sshx -h=prod-web "sudo install -m 0644 /tmp/service.conf /etc/service/service.conf"

For removals, list before deleting:

sshx -h=prod-web --list=/tmp
sshx -h=prod-web --rm=/tmp/old-file

Remote SFTP paths are remote paths. Do not rely on local OS path rules for remote targets.

Incident Response Checklist

When something looks wrong:

  1. Stop retrying with weaker security flags.
  2. Capture the exact command, exit code, and error_kind.
  3. Check audit events under ~/.sshx/audit or the configured --audit-output.
  4. Verify host-key state with ssh-keygen -F <host>.
  5. Check whether the failure happened before SSH, during auth, during safety validation, during command execution, or during output collection.
  6. Rotate exposed credentials if a secret may have entered shell history, CI logs, issue text, or chat.

Good Defaults For Shared Runbooks

sshx -h=<named-host> \
  --timeout=30s \
  --audit-output=./.sshx-audit \
  --dry-run \
  --json \
  "sudo systemctl reload <service>"

Then run the real command only after the plan is reviewed:

sshx -h=<named-host> \
  --timeout=30s \
  --audit-output=./.sshx-audit \
  -pk=<sudo-key> \
  "sudo systemctl reload <service>"
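
The two steps can be tied together with a confirmation gate so the real command cannot run before someone has seen the plan. This is a sketch under assumptions: reload_with_review is a hypothetical name, the SSHX variable exists only so the function can be tried with a stand-in binary, and the sketch assumes a non-zero exit from the dry-run means the plan should not proceed.

```shell
# Binary to invoke; override only to try the function without a real server.
SSHX="${SSHX:-sshx}"

# reload_with_review HOST SUDO_KEY SERVICE
reload_with_review() {
  host="$1"; sudo_key="$2"; service="$3"
  cmd="sudo systemctl reload $service"

  # Step 1: print the local plan; stop if sshx rejects it.
  "$SSHX" -h="$host" --timeout=30s --audit-output=./.sshx-audit \
    --dry-run --json "$cmd" || return 1

  # Step 2: explicit human confirmation, read from stdin.
  printf 'Run on %s? [y/N] ' "$host" >&2
  read -r answer || answer=""
  [ "$answer" = "y" ] || { echo "aborted" >&2; return 1; }

  # Step 3: the real command, with the same timeout and audit location.
  "$SSHX" -h="$host" --timeout=30s --audit-output=./.sshx-audit \
    -pk="$sudo_key" "$cmd"
}
```

Anything other than a literal y aborts, which keeps the default on the safe side.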

Troubleshooting

Use the failure boundary first: did sshx fail before the remote command ran, or did the remote command run and exit non-zero?

Get Structured Error Details

sshx -h=prod-web --json "systemctl is-active nginx"

Look at:

  • success
  • exit_code
  • error_kind
  • stderr
  • auth_method

An sshx-level failure in JSON mode has exit_code: -1 and a non-empty error_kind.

Host Key Errors

Symptoms:

  • Unknown host key.
  • Changed host key.
  • Connection aborts before authentication.

Checks:

ssh-keygen -F prod-web
ssh-keyscan -H prod-web

Fix only after confirming the host is expected. Do not jump straight to --insecure-hostkey.

Authentication Errors

Check the resolved host and selected key:

sshx -h=prod-web --dry-run --json "whoami"

Common causes:

  • Wrong user in ~/.sshx/settings.json.
  • Wrong per-host key path.
  • Key file has bad permissions.
  • Server does not allow the selected authentication method.
  • You expected the keyring sudo password to act as an SSH login password.

Keyring passwords are for sudo auto-fill. They are not silently used as SSH login passwords.

Sudo Does Not Auto-Fill

sshx only auto-fills sudo when the command starts with sudo.

Works:

sshx -h=prod-web -pk=prod-web-sudo "sudo whoami"

Does not trigger auto-fill:

sshx -h=prod-web "sh -c 'sudo whoami'"

Check that the password key exists:

sshx --password-check=prod-web-sudo

A Command Is Blocked

Blocked commands are usually safety-check failures.

sshx -h=prod-web --dry-run --json "sudo rm -rf /"

If a privileged or destructive command is genuinely intended, review it, record the reason, and use --force only for that invocation.

Script Hangs

Set a timeout:

sshx -h=prod-web --timeout=30s --json "long-running-command"

If the command requires terminal behavior, use --pty, but remember that PTY mode is less suitable for structured automation.

JSON Output Is Not Parseable

In normal JSON mode, stdout should contain one JSON object and diagnostics should stay on stderr. Check for these issues:

  • The command was run with --pty.
  • A wrapper script printed extra text around the sshx call.
  • The caller mixed stdout and stderr.

SFTP Path Problems

Use local path rules only for local files. Use slash-separated remote paths for remote targets:

sshx -h=prod-web --upload=./file.txt --to=/tmp/file.txt

Audit Events Are Missing

Check whether audit was disabled:

env | grep SSHX_NO_AUDIT

Check the output location:

ls ~/.sshx/audit

If using a project-local location:

sshx -h=prod-web --audit-output=./.sshx-audit "uptime"
ls ./.sshx-audit

Command Not Found

Check installation:

command -v sshx
sshx --version

If installed with Go, confirm ~/go/bin or your GOPATH/bin is in PATH.

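SSHX Documentation

sshx is a cross-platform SSH/SFTP command-line client for people and automation agents who regularly operate many remote servers. It keeps a very simple model: one command opens one SSH session, performs the requested operation, writes a local audit event if requested, and exits.

The documentation's default home page is in English. Use the language switch in the top navigation bar to open the matching Chinese page.

What SSHX Is Good At

  • Run remote commands with stable stdout, stderr, and exit codes.
  • Save sudo passwords in the operating system keychain instead of plaintext files.
  • Use short host names from ~/.sshx/settings.json instead of retyping IP, port, user, and key path.
  • Handle common file uploads, downloads, and directory operations without opening an interactive SFTP client.
  • Produce JSON that scripts and AI agents can branch on.
  • Use --dry-run to preview the local execution plan before connecting, reading secrets, modifying known_hosts, or writing config.
  • Write a local JSONL audit log without recording plaintext passwords, private keys, stdout, or stderr.

Mental Model

Think of sshx as a safer one-shot remote operation helper, not a replacement for an interactive shell and not a remote orchestration platform.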

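human, script, or agent
        |
        v
sshx CLI flags and optional .env
        |
        v
named host resolution and safety checks
        |
        v
SSH command or SFTP operation
        |
        v
structured result, exit code, optional audit event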

Common First Commands

# See flags and examples
sshx --help

# Run a simple command
sshx -h=192.168.1.100 -u=root "uptime"

# Use a named host
sshx -h=prod-web "systemctl is-active nginx"

# Preview the execution plan before connecting
sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"

# Produce machine-readable output for automation
sshx -h=prod-web --json "systemctl is-active nginx"

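Safety First

Remote operation tools can cause real damage. The default safe path in sshx is strict:

  • Host keys are verified through known_hosts.
  • Passwords belong in the OS keyring, not in shell history or config files.
  • Sudo passwords are passed through stdin and never spliced into the command string.
  • Obviously dangerous, destructive commands are blocked by default unless the user explicitly bypasses the check.
  • Safety checks are guardrails against mistakes, not a sandbox for untrusted commands.

Read the Security Guidelines before using sshx in production or agent-driven workflows.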

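Getting Started

This guide walks through a safe first-time setup. The examples use prod-web as the host name; replace it with your own server name or IP.

Install

If Go is already installed: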

go install github.com/talkincode/sshx/cmd/sshx@latest
sshx --version

You can also run sshx without installing it. Replace @latest with a version tag to pin a specific release:

go run github.com/talkincode/sshx/cmd/sshx@latest --help

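Verify SSH Trust First

sshx verifies host keys by default. Before the first connection, add the server to known_hosts:

ssh-keyscan -H prod-web >> ~/.ssh/known_hosts

If you explicitly accept trust on first use, use:

sshx --accept-unknown-host -h=prod-web "uptime"

Do not use --insecure-hostkey except in short-lived controlled lab environments. It turns off the host trust check that protects against man-in-the-middle attacks.

Run The First Command

sshx -h=prod-web -u=deploy "uptime"

Common variations:

# Non-standard SSH port
sshx -h=prod-web -p=2222 -u=deploy "uptime"

# Specific SSH key
sshx -h=prod-web -u=deploy -i=~/.ssh/prod-web.pem "uptime"

# Put an upper bound on a slow command
sshx -h=prod-web --timeout=30s "apt-get update"
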
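Add A Named Host

Named hosts keep connection details in one local file:

sshx --host-add --host-name=prod-web -h=192.168.1.100 -u=deploy -i=~/.ssh/prod-web.pem --host-desc="Production web node"

After that, you can use:

sshx --host-list
sshx --host-test=prod-web
sshx -h=prod-web "uname -a"

The settings file is ~/.sshx/settings.json and is written with 0600 permissions.

Save A Sudo Password

For commands that start with sudo, sshx can read the password from the OS keyring and pass it to sudo through stdin.

sshx --password-set=prod-web-sudo
sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl status nginx"

Prefer interactive input. Do not use inline values such as --password-set=key:password; they can leak into shell history or process lists.

Preview Before Running

--dry-run explains how sshx interprets the command, but it does not connect, execute, read keyring secrets, modify known_hosts, or write host config.

sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"

Dry-run proves the local execution plan; it does not prove that the remote command will succeed.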

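Host Management

Named hosts turn repetitive SSH details into a short name. They suit cases where you manage several servers, where each server uses a different SSH key, or where an agent needs a stable host inventory.

Add Hosts

Interactive setup:

sshx --host-add

Command-line setup:

sshx --host-add \
  --host-name=prod-web \
  -h=192.168.1.100 \
  -p=22 \
  -u=deploy \
  -i=~/.ssh/prod-web.pem \
  -pk=prod-web-sudo \
  --host-desc="Production web node" \
  --host-type=linux

Then run commands by alias:

sshx -h=prod-web "hostname && uptime"

Settings File

Host definitions are stored in ~/.sshx/settings.json: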

{
  "key": "/Users/alice/.ssh/id_rsa",
  "hosts": [
    {
      "name": "prod-web",
      "description": "Production web node",
      "host": "192.168.1.100",
      "port": "22",
      "user": "deploy",
      "key": "/Users/alice/.ssh/prod-web.pem",
      "password_key": "prod-web-sudo",
      "type": "linux"
    }
  ]
}

顶层 key 是默认 SSH 私钥。单个 host 的 key 只覆盖这一台主机。

日常主机命令

# 列出已配置主机
sshx --host-list

# Test a single host
sshx --host-test=prod-web

# Test all hosts, each with its own dial timeout
sshx --host-test-all

# Update a host
sshx --host-update --host-name=prod-web -u=deploy -i=~/.ssh/prod-web-2026.pem

# Remove a host
sshx --host-remove=old-lab

Practical Naming

A good host name states both the environment and the role:

prod-web-1
prod-db-primary
staging-api
lab-router
customer-a-jump

Password key names should not reveal sensitive topology. In shared runbooks, prefer placeholders:

sshx -h=prod-web -pk=<sudo-key> "sudo systemctl reload nginx"

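Team and Agent Use

For humans, named hosts reduce typing mistakes. For automation agents, they provide a stable boundary:

  • The agent receives prod-web, not a bare IP and key path.
  • Operators can review ~/.sshx/settings.json.
  • --dry-run --json confirms which address, port, user, key, and sudo key will actually be used.
  • Audit events can record the resolved host without storing secrets.

SFTP Workflows

sshx supports common one-shot SFTP operations. It is not an interactive file manager; each invocation performs exactly one upload, download, directory listing, directory creation, or delete.

Upload a File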

sshx -h=prod-web --upload=./deploy/nginx.conf --to=/tmp/nginx.conf

A safer pattern for production:

# Upload to a temporary path first
sshx -h=prod-web --upload=./deploy/nginx.conf --to=/tmp/nginx.conf

# After checking it, move it into its final location
sshx -h=prod-web "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf"
sshx -h=prod-web "sudo nginx -t"
sshx -h=prod-web "sudo systemctl reload nginx"

Download a File

sshx -h=prod-web --download=/var/log/nginx/error.log --to=./error.log

Example of collecting incident evidence:

mkdir -p incident-2026-07-01/prod-web
sshx -h=prod-web --download=/var/log/nginx/error.log --to=incident-2026-07-01/prod-web/error.log
sshx -h=prod-web --download=/etc/os-release --to=incident-2026-07-01/prod-web/os-release

List and Create Directories

sshx -h=prod-web --list=/var/log
sshx -h=prod-web --mkdir=/tmp/sshx-upload

Delete a Remote File

sshx -h=prod-web --rm=/tmp/old-upload.txt

Treat a remote delete as a production change. List the parent directory first:

sshx -h=prod-web --list=/tmp
sshx -h=prod-web --rm=/tmp/old-upload.txt

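Path Boundaries

Local paths follow the rules of the local operating system. Remote paths are SFTP paths and should use forward slashes, even when sshx runs on Windows.

# Local Windows path, remote POSIX path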
sshx -h=prod-web --upload=C:\Users\alice\release.zip --to=/tmp/release.zip
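A wrapper script can catch a local-style path before it is handed to --to. The following is a minimal sketch, not part of sshx: the function name is_remote_sftp_path is hypothetical, and requiring an absolute path is an extra restriction chosen for this example.

```shell
# Hypothetical pre-check (not part of sshx): accept only absolute,
# slash-separated remote paths; reject backslashes and drive letters.
is_remote_sftp_path() {
  case "$1" in
    *\\*) return 1 ;;        # contains a backslash: looks like a Windows path
    [A-Za-z]:*) return 1 ;;  # starts with a drive letter such as C:
    /*) return 0 ;;          # absolute, slash-separated path
    *) return 1 ;;           # relative or empty: rejected in this sketch
  esac
}

# Usage (assumes sshx is installed and prod-web is a configured host):
#   is_remote_sftp_path "$remote" && sshx -h=prod-web --upload=./file.txt --to="$remote"
```

The check is purely textual; it does not contact the server or confirm that the path exists.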

When to Use an SSH Command Instead

When an operation needs remote verification or a permission change, use an SSH command:

sshx -h=prod-web "sudo ls -l /etc/nginx"
sshx -h=prod-web "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf"

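SFTP handles moving files. Remote commands handle inspection, ownership changes, service reloads, and cleanup that requires sudo.

Agent and Script Mode

sshx is designed to be called from scripts and AI agents. The contract is simple: stable stdout/stderr, stable exit codes, optional JSON, and optional local audit events.

Default Output Streams

No PTY is requested by default, so stdout and stderr stay separate and terminal control characters do not leak into script output.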

sshx -h=prod-web "systemctl is-active nginx"

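Once the remote command has run, its exit code becomes the exit code of the sshx process.

Exit Codes

Exit code   Meaning
0           The remote command succeeded.
1..254      The remote command failed with that exit code.
255         sshx-level failure, such as connection, authentication, host-key, timeout, blocked command, configuration, or another local error.

In JSON mode, sshx-level failures use exit_code: -1 and a non-empty error_kind, so automation can tell them apart from a remote command that exited with 255.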
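A script that only sees the process exit code can sort results into the rows of the exit-code table. The sketch below is illustrative: classify_sshx_exit and its labels are hypothetical names, not part of sshx. It also shows why a bare 255 stays ambiguous without --json.

```shell
# Hypothetical helper (not part of sshx): map a process exit code to the
# categories of the exit-code table.
classify_sshx_exit() {
  case "$1" in
    ''|*[!0-9]*) echo "invalid" ;;             # empty or not a number
    0) echo "remote-success" ;;
    255) echo "sshx-failure-or-remote-255" ;;  # ambiguous without --json
    *) echo "remote-failure" ;;
  esac
}

# Usage (assumes sshx is installed and prod-web is a configured host):
#   sshx -h=prod-web "systemctl is-active nginx"
#   classify_sshx_exit "$?"
```

When the distinction matters, read exit_code and error_kind from the JSON result instead of relying on the process exit code alone.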

JSON Output

sshx -h=prod-web --json "systemctl is-active nginx"

Example structure:

{
  "host": "192.168.1.100",
  "port": "22",
  "user": "deploy",
  "command": "systemctl is-active nginx",
  "exit_code": 0,
  "success": true,
  "stdout": "active\n",
  "stderr": "",
  "duration_ms": 142,
  "auth_method": "key"
}

Example of agent branching:

result="$(sshx -h=prod-web --json "systemctl is-active nginx")"
if printf '%s' "$result" | jq -e '.success == true' >/dev/null; then
  echo "nginx is active"
else
  printf '%s\n' "$result" | jq '{exit_code, error_kind, stderr}'
fi

Review Changes with dry-run

Before a script performs a privileged operation, look at the plan first:

sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"

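Use dry-run to check host resolution, the sudo key, the safety check result, and whether the real run would change state. Do not treat a dry run as proof that the remote service will restart successfully.

Timeouts

Unattended workflows should always set a timeout: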

sshx -h=prod-web --timeout=30s --json "systemctl is-active nginx"
sshx -h=prod-web --timeout=2m --json "sudo apt-get update"

Audit Events

Non-dry-run invocations write a local JSONL audit event by default:

~/.sshx/audit/sshx-YYYY-MM-DD.jsonl

To keep audit events next to a project or incident directory:

sshx -h=prod-web --audit-output=./.sshx-audit "systemctl reload nginx"

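Audit events exist for traceability. They record metadata and results, but not plaintext passwords, private key contents, stdout, or stderr.

PTY Must Be Enabled Explicitly

Some commands need terminal semantics: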

sshx -h=prod-web --pty "top -b -n1"

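Do not combine --pty with --json. A PTY merges stderr into stdout, which makes structured automation unreliable.

Use Cases

This page deliberately includes many examples. Treat the host names as placeholders and adapt the commands to your own runbook.

Scenario 1: First Health Check

You have just been given access to a server; start with low-risk checks: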

ssh-keyscan -H prod-web >> ~/.ssh/known_hosts
sshx -h=prod-web -u=deploy "hostname && uptime && whoami"

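This verifies host trust, authentication, the remote user, and basic connectivity in one pass, without modifying the server.

Scenario 2: Add a Production Host Once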

sshx --host-add \
  --host-name=prod-web \
  -h=192.168.1.100 \
  -u=deploy \
  -i=~/.ssh/prod-web.pem \
  -pk=prod-web-sudo \
  --host-desc="Production web node"

sshx --host-test=prod-web
sshx -h=prod-web "hostname"

Later commands no longer need to repeat the IP, user, key path, or sudo key.

Scenario 3: Check a Service Without Changing Anything

sshx -h=prod-web "systemctl is-active nginx"
sshx -h=prod-web "systemctl status nginx --no-pager"

Automation version:

sshx -h=prod-web --json "systemctl is-active nginx"

Scenario 4: Restart a Service with Review

sshx -h=prod-web --dry-run --json "sudo systemctl restart nginx"
sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl restart nginx"
sshx -h=prod-web "systemctl is-active nginx"

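The dry run confirms that the local interpretation is correct before the privileged change.

Scenario 5: Check Disk Pressure Across Several Machines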

for host in prod-web prod-api prod-db; do
  echo "== $host =="
  sshx -h="$host" --timeout=15s "df -h / /var /data"
done

Agent-friendly version:

for host in prod-web prod-api prod-db; do
  sshx -h="$host" --timeout=15s --json "df -h / /var /data"
done
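The loops above print each result but do not say whether any host failed overall. One possible wrapper, sketched here under the assumption that an sshx command is on PATH, keeps going after a failure and returns non-zero if any host failed. The name check_hosts and the ok/fail line format are hypothetical.

```shell
# Hypothetical wrapper (not part of sshx): run one read-only command on
# several named hosts, continue on failure, and report an overall status.
check_hosts() {
  cmd="$1"; shift
  failed=0
  for host in "$@"; do
    if sshx -h="$host" --timeout=15s "$cmd" >/dev/null; then
      echo "ok   $host"
    else
      # In the else branch, $? still holds the exit code of the sshx call.
      echo "fail $host (exit $?)"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_hosts "df -h / /var /data" prod-web prod-api prod-db
```

Remote stdout is discarded here to keep the summary short; drop the redirect if the disk figures themselves are needed.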

Scenario 6: Collect Logs for an Incident

mkdir -p incident-2026-07-01/prod-web
sshx -h=prod-web --download=/var/log/nginx/error.log --to=incident-2026-07-01/prod-web/error.log
sshx -h=prod-web --download=/var/log/nginx/access.log --to=incident-2026-07-01/prod-web/access.log
sshx -h=prod-web --audit-output=incident-2026-07-01/audit "journalctl -u nginx --since '30 min ago' --no-pager"

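The downloaded evidence and the local audit metadata end up side by side under the same incident directory.

Scenario 7: Upload Configuration Safely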

sshx -h=prod-web --upload=./nginx.conf --to=/tmp/nginx.conf
sshx -h=prod-web "sudo nginx -t -c /tmp/nginx.conf"
sshx -h=prod-web "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf"
sshx -h=prod-web "sudo nginx -t"
sshx -h=prod-web "sudo systemctl reload nginx"

Stage and validate the file first, then replace the production configuration.
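When these steps run from a script, each one should stop the sequence if it fails, so a bad file never reaches /etc/nginx. The sketch below wraps the same five commands; the function name staged_nginx_deploy and the step-number return codes are hypothetical, and it assumes the named host already has a sudo password key configured.

```shell
# Hypothetical wrapper (not part of sshx): run the staged upload steps in
# order and stop at the first failure. The return code is the step number.
staged_nginx_deploy() {
  host="$1"; conf="$2"
  sshx -h="$host" --upload="$conf" --to=/tmp/nginx.conf || return 1
  sshx -h="$host" "sudo nginx -t -c /tmp/nginx.conf" || return 2
  sshx -h="$host" "sudo install -m 0644 /tmp/nginx.conf /etc/nginx/nginx.conf" || return 3
  sshx -h="$host" "sudo nginx -t" || return 4
  sshx -h="$host" "sudo systemctl reload nginx" || return 5
}

# Usage:
#   staged_nginx_deploy prod-web ./nginx.conf || echo "stopped at step $?"
```

A failure at step 2 leaves the production file untouched; a failure at step 4 means the new file is already installed and needs a rollback decision.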

Scenario 8: Different sudo Keys for Different Hosts

sshx --password-set=prod-web-sudo
sshx --password-set=prod-db-sudo

sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl reload nginx"
sshx -h=prod-db -pk=prod-db-sudo "sudo systemctl status postgresql"

This lets one operator manage several servers without reusing a single global sudo key.

Scenario 9: Verify All Configured Hosts

sshx --host-test-all

Run this first after rotating keys, changing the VPN, or importing a new settings.json.

Scenario 10: Generate a Safe Status Report

for host in prod-web prod-api prod-db; do
  sshx -h="$host" --timeout=20s --json "hostname && uptime" \
    | jq --arg host "$host" '{alias: $host, success, exit_code, error_kind, stdout}'
done

The script reads JSON fields instead of parsing free-form terminal output.

Scenario 11: Limit How Long a Command Can Run

sshx -h=prod-web --timeout=2m "sudo apt-get update"

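Unattended commands should never hang indefinitely.

Scenario 12: Diagnose a Host-Key Failure

If a host key has changed, do not bypass the check first. Find out why it changed: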

ssh-keygen -F prod-web
ssh-keyscan -H prod-web

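Update known_hosts only after confirming that the machine was rebuilt, reinstalled, or rotated as planned.

Scenario 13: Avoid Piped Install Scripts

This pattern is high risk: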

sshx -h=prod-web "curl -fsSL https://example.invalid/install.sh | sh"

A safer pattern:

sshx -h=prod-web "curl -fsSL https://example.invalid/install.sh -o /tmp/install.sh"
sshx -h=prod-web "less /tmp/install.sh"
sshx -h=prod-web "sha256sum /tmp/install.sh"
sshx -h=prod-web "sh /tmp/install.sh"

Scenario 14: Use a PTY Only When Necessary

sshx -h=prod-web --pty "sudo visudo -c"

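In scripts, prefer non-PTY mode because it keeps stdout and stderr separate.

Scenario 15: Disable Auditing for a Single Sensitive Run

If the command text itself would expose sensitive context, you can disable auditing for that one run and record the reason in your own runbook.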

SSHX_NO_AUDIT=true sshx -h=prod-web "echo redacted"

Do not make this the default. Audit events are valuable when explaining what happened afterward.

Scenario 16: Inspect Docker Without Opening a Shell

sshx -h=prod-web --json "docker ps --format '{{json .}}' | head -20"
sshx -h=prod-web "docker inspect nginx --format '{{.State.Status}} {{.RestartCount}}'"

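This collects container state without an interactive SSH session and without copying large volumes of logs.

Scenario 17: Verify a Deployment Artifact Before Release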

sshx -h=prod-web --upload=./dist/app.tar.gz --to=/tmp/app.tar.gz
sshx -h=prod-web "sha256sum /tmp/app.tar.gz"
sshx -h=prod-web "tar -tzf /tmp/app.tar.gz | head"

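Continue with the installation only after both the checksum and the archive contents match the release notes.

Scenario 18: Rotate Service Configuration with a Rollback Point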

sshx -h=prod-web --upload=./service.env --to=/tmp/service.env.new
sshx -h=prod-web "sudo cp /etc/myapp/service.env /etc/myapp/service.env.bak.\$(date +%Y%m%d%H%M%S)"
sshx -h=prod-web "sudo install -m 0600 /tmp/service.env.new /etc/myapp/service.env"
sshx -h=prod-web "sudo systemctl restart myapp"
sshx -h=prod-web --json "systemctl is-active myapp"

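Backup, permissioned install, restart, and health check are separate, visible steps, which makes failures easier to locate.

Scenario 19: Collect a Minimal Support Bundle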

mkdir -p support/prod-web
sshx -h=prod-web --download=/etc/os-release --to=support/prod-web/os-release
sshx -h=prod-web --audit-output=support/audit "uname -a"
sshx -h=prod-web --audit-output=support/audit "df -h"
sshx -h=prod-web --audit-output=support/audit "free -m"

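Do not download private application data unless the support ticket explicitly requires it.

Scenario 20: Use -- When Remote Arguments Look Like Local Flags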

sshx -h=prod-web -- docker run --rm alpine:3.20 sh -c 'echo hello'
sshx -h=prod-web -- echo --force belongs-to-the-remote-command

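-- draws a clear line between local sshx flags and remote command arguments, so remote arguments are not mistaken for local switches.

Scenario 21: Test a New Host Entry Before Sharing It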

sshx --host-add --host-name=staging-api -h=10.0.8.21 -u=deploy -i=~/.ssh/staging.pem -pk=staging-api-sudo
sshx --host-test=staging-api
sshx -h=staging-api --dry-run --json "sudo systemctl reload api"

Share the runbook only after the named host resolves, authenticates, and selects the expected sudo key.

Scenario 22: Put Bounds on a Migration

sshx -h=prod-db --timeout=10s --json "pg_isready"
sshx -h=prod-db --timeout=5m --dry-run --json "sudo systemctl restart postgresql"
sshx -h=prod-db --timeout=5m -pk=prod-db-sudo "sudo systemctl restart postgresql"
sshx -h=prod-db --timeout=30s --json "pg_isready"

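Every step has a time limit, and the checks before and after the restart return machine-readable results.

Scenario 23: Delete Temporary Files with Evidence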

sshx -h=prod-web --list=/tmp
sshx -h=prod-web --rm=/tmp/app.tar.gz
sshx -h=prod-web --list=/tmp

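The state before and after the delete should both be visible. For high-risk paths, prefer moving the file to a dated quarantine directory over deleting it permanently.

Scenario 24: Make CI Fail Closed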

result="$(sshx -h=prod-web --timeout=20s --json "systemctl is-active nginx")"
printf '%s\n' "$result" | jq .
printf '%s\n' "$result" | jq -e '.success == true and .stdout == "active\n"'

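CI fails immediately when the structured result is missing, the command fails, or the service state does not match what the runbook expects.

Security Guidelines

Remote execution has a large blast radius. These rules must be followed strictly, because one small mistake can modify a production system, leak credentials, or obscure the real cause of an incident.

Non-Negotiable Rules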

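  1. Keep strict host-key verification.
  2. Store passwords in the OS keyring, not in files, shell history, tickets, or chat logs.
  3. Pass sudo passwords only over stdin; never splice them into the command string.
  4. --force, --no-safety-check, and --insecure-hostkey are exceptional break-glass options.
  5. Run --dry-run first for privileged or destructive operations.
  6. In automation, use --json and explicit exit-code checks.
  7. Remember that the command safety check is not a sandbox.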

Production Policy

For production environments, shared runbooks, CI jobs, and agent-driven operations, treat the following as policy, not suggestions:

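  • Use named hosts so reviewers can see the target clearly.
  • Set --timeout on every unattended command.
  • Use --audit-output for project, migration, release, and incident operations.
  • Run --dry-run --json before any privileged change.
  • Do not put --force or --no-safety-check in reusable scripts.
  • Do not put --insecure-hostkey in reusable scripts or CI.
  • Review commands copied from chat, tickets, or web pages against the target host and the rollback plan before running them.
  • Prefer staged writes: upload to /tmp, validate, then install with explicit permissions and ownership.
  • For irreversible actions, keep one visible step per command where possible; avoid chaining many privileged changes with &&.
  • When production is affected, record the maintenance window, operator, commands, results, and rollback decision in your own runbook.

Do not turn weakened safety flags into global defaults through shell profiles, CI variables, or a shared .env. A break-glass bypass must apply to a single command only and be easy to remove.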
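A team could enforce part of this policy with a small lint step in CI. The following is a sketch under stated assumptions: lint_weak_flags is a hypothetical helper, not an sshx feature, and the text match is deliberately simple, so it will also flag the switches inside comments.

```shell
# Hypothetical lint helper (not part of sshx): fail when a script file
# contains one of the break-glass switches named in this section.
lint_weak_flags() {
  file="$1"
  # -e keeps grep from reading the leading "--" of the pattern as an option.
  if grep -n -E -e '--(force|no-safety-check|insecure-hostkey)([[:space:]=]|$)|SSH_INSECURE_HOST_KEY=' "$file"; then
    echo "weak safety flag found in $file" >&2
    return 1
  fi
  return 0
}

# Usage:
#   lint_weak_flags ./scripts/deploy.sh || exit 1
```

Matching lines are printed with their line numbers so a reviewer can see exactly where the bypass was written.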

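Host-Key Trust

The default behavior rejects unknown or changed host keys. Use one of these safe paths:

# Recommended: add the host key explicitly after reviewing the target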
ssh-keyscan -H prod-web >> ~/.ssh/known_hosts

# Accept trust-on-first-use for a controlled host
sshx --accept-unknown-host -h=prod-web "uptime"

Avoid:

sshx --insecure-hostkey -h=prod-web "uptime"

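Insecure host-key mode is only appropriate for short-lived, controlled lab environments, and the risk should be recorded explicitly. Do not put it in default scripts or shared runbooks.

Secret Handling

Use interactive keyring storage: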

sshx --password-set=prod-web-sudo

Avoid inline secrets:

sshx --password-set=prod-web-sudo:plain-text-password

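Inline values can leak into shell history, terminal scrollback, the process list, logs, or commands that get copied elsewhere.

A keyring password key is used for sudo auto-fill. SSH_PASSWORD is the SSH login password and should be treated as a high-risk fallback, not a normal mode of operation.

Sudo Rules

sshx auto-fills sudo only when the remote command starts with sudo: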

sshx -h=prod-web -pk=prod-web-sudo "sudo systemctl reload nginx"

These do not trigger auto-fill:

sshx -h=prod-web "sh -c 'sudo whoami'"
sshx -h=prod-web "echo sudo"

This boundary keeps password lookup, stdin injection, and audit fields under one clear rule.
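A runbook linter or wrapper can approximate the same rule locally to warn when -pk would have no effect. This is only a sketch of the documented behavior: starts_with_sudo is a hypothetical name, and the real check inside sshx may treat leading whitespace or other edge cases differently.

```shell
# Hypothetical local approximation (not the sshx implementation): succeed
# only when the command string begins with the word "sudo".
# Assumption: leading whitespace is not stripped.
starts_with_sudo() {
  case "$1" in
    sudo|"sudo "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   starts_with_sudo "$remote_cmd" || echo "note: -pk will not be used" >&2
```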

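Safety Checks Are Only Guardrails

sshx blocks common destructive patterns, such as deleting the root directory, formatting disks, shutting down or rebooting, modifying critical system files, fork bombs, and pipes like curl | sh.

That does not make an untrusted command safe. A command validator cannot understand every script, shell expansion, application migration, or business-data deletion path.

Before bypassing a check: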

sshx -h=prod-web --dry-run --json "sudo systemctl reboot"
sshx -h=prod-web --force "sudo systemctl reboot"

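Confirm first:

  • Is the target host correct?
  • Has the command been reviewed?
  • Is there a maintenance window?
  • Is there a rollback plan?
  • Has the reason for the bypass been recorded?

If any answer is "no", stop and fix the runbook first. --force should mean "I have reviewed this command for this target", not "make the tool stop warning me".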

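Agent and Automation Rules

Automation should be more conservative than a human at a terminal:

  • Always set --timeout.
  • Prefer --json.
  • Parse success, exit_code, and error_kind.
  • Run --dry-run --json before privileged changes.
  • Do not set SSH_INSECURE_HOST_KEY=1 globally.
  • Do not pass plaintext passwords through environment variables unless there is no safer path and the lifetime is strictly controlled.
  • For project, migration, or incident runs, save audit events with --audit-output.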

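Audit Boundaries

Audit events are a local JSONL provenance record. They capture metadata such as mode, action, host resolution, sudo/keyring decisions, safety status, authentication method, exit code, error kind, and duration.

They deliberately do not record:

  • Plaintext passwords.
  • Private key contents.
  • stdout.
  • stderr.

The command text is written as provenance material, with common password/token-style arguments redacted, but that is no reason to put secrets in commands.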

SFTP Safety

When uploading to a privileged path, stage the file first:

sshx -h=prod-web --upload=./service.conf --to=/tmp/service.conf
sshx -h=prod-web "sudo install -m 0644 /tmp/service.conf /etc/service/service.conf"

List the directory before deleting:

sshx -h=prod-web --list=/tmp
sshx -h=prod-web --rm=/tmp/old-file

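A remote SFTP path is a remote path; do not apply local operating system path rules to it.

Incident Response Checklist

When something looks wrong:

  1. Stop retrying with weaker safety flags.
  2. Record the exact command, exit code, and error_kind.
  3. Check the audit events in ~/.sshx/audit or in the directory given to --audit-output.
  4. Verify host-key state with ssh-keygen -F <host>.
  5. Determine whether the failure occurred before SSH, during authentication, during the safety check, during command execution, or during output collection.
  6. If a secret may have reached shell history, CI logs, issue text, or chat logs, rotate the affected credentials immediately.

Good Defaults for Shared Runbooks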

sshx -h=<named-host> \
  --timeout=30s \
  --audit-output=./.sshx-audit \
  --dry-run \
  --json \
  "sudo systemctl reload <service>"

After the plan has been reviewed, run the real command:

sshx -h=<named-host> \
  --timeout=30s \
  --audit-output=./.sshx-audit \
  -pk=<sudo-key> \
  "sudo systemctl reload <service>"
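The two phases can be tied together so the real command cannot run before the plan has been shown. The sketch below is one possible shape: plan_then_run and the APPROVED variable are hypothetical conventions for this example, not sshx features.

```shell
# Hypothetical two-phase wrapper (not part of sshx): always show the
# dry-run plan first, and execute only when APPROVED=yes is set.
plan_then_run() {
  host="$1"; key="$2"; cmd="$3"
  sshx -h="$host" --timeout=30s --audit-output=./.sshx-audit \
    --dry-run --json "$cmd" || return 1
  if [ "${APPROVED:-no}" != "yes" ]; then
    echo "plan shown; set APPROVED=yes to execute" >&2
    return 3
  fi
  sshx -h="$host" --timeout=30s --audit-output=./.sshx-audit \
    -pk="$key" "$cmd"
}

# Usage:
#   plan_then_run prod-web prod-web-sudo "sudo systemctl reload nginx"
#   APPROVED=yes plan_then_run prod-web prod-web-sudo "sudo systemctl reload nginx"
```

Return code 3 marks "plan only" so a caller can tell it apart from a failed dry run (1) and from the exit code of the real command.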

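Troubleshooting

First identify the failure boundary: did sshx fail before the remote command ran, or did the remote command run and return a non-zero exit code?

Get Structured Error Details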

sshx -h=prod-web --json "systemctl is-active nginx"

Focus on:

  • success
  • exit_code
  • error_kind
  • stderr
  • auth_method

In JSON mode, sshx-level failures have exit_code: -1 and a non-empty error_kind.

Host Key Errors

Symptoms:

  • Unknown host key.
  • Host key has changed.
  • Connection drops before authentication.

Check:

ssh-keygen -F prod-web
ssh-keyscan -H prod-web

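Repair it only after confirming the host is what you expect. Do not jump straight to --insecure-hostkey.

Authentication Errors

Check the resolved host and the selected key: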

sshx -h=prod-web --dry-run --json "whoami"

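Common causes:

  • Wrong user in ~/.sshx/settings.json.
  • Wrong per-host key path.
  • Incorrect permissions on the key file.
  • The server does not accept the selected authentication method.
  • Assuming that the sudo password in the keyring is also used as the SSH login password.

Keyring passwords are used for sudo auto-fill and are never silently used as the SSH login password.

Sudo Was Not Auto-Filled

sshx auto-fills only when the command starts with sudo.

Triggers auto-fill: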

sshx -h=prod-web -pk=prod-web-sudo "sudo whoami"

Does not trigger auto-fill:

sshx -h=prod-web "sh -c 'sudo whoami'"

Check that the password key exists:

sshx --password-check=prod-web-sudo

Command Blocked

This usually means a safety check failed.

sshx -h=prod-web --dry-run --json "sudo rm -rf /"

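If the privileged or destructive command really is intended, review it, record the reason, and then use --force for that one run only.

Script Hangs

Set a timeout: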

sshx -h=prod-web --timeout=30s --json "long-running-command"

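If the command truly needs terminal semantics, you can use --pty, but PTY mode is not suitable for structured automation.

JSON Output Cannot Be Parsed

In normal JSON mode, stdout should contain exactly one JSON object, and diagnostics go to stderr. Check for these problems:

  • Whether --pty was used.
  • Whether the wrapper script printed extra text before or after sshx.
  • Whether the caller merged stdout and stderr.

SFTP Path Problems

Local files use local path rules. Remote targets use slash-separated remote paths: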

sshx -h=prod-web --upload=./file.txt --to=/tmp/file.txt

Missing Audit Events

Check whether auditing was disabled:

env | grep SSHX_NO_AUDIT

Check the default output location:

ls ~/.sshx/audit

If you use a project-local directory:

sshx -h=prod-web --audit-output=./.sshx-audit "uptime"
ls ./.sshx-audit
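These checks can be folded into one helper for a troubleshooting script. The sketch below makes two assumptions that the text above does not confirm: that a custom --audit-output directory uses the same sshx-YYYY-MM-DD.jsonl file naming as the default location, and that any non-empty SSHX_NO_AUDIT value is worth flagging. The name audit_status is hypothetical.

```shell
# Hypothetical troubleshooting helper (not part of sshx): report whether
# SSHX_NO_AUDIT is set and whether any audit file exists in a directory.
# Assumption: files are named sshx-*.jsonl in custom directories too.
audit_status() {
  dir="${1:-$HOME/.sshx/audit}"
  if [ -n "${SSHX_NO_AUDIT:-}" ]; then
    echo "set: SSHX_NO_AUDIT=$SSHX_NO_AUDIT"
  elif ls "$dir"/sshx-*.jsonl >/dev/null 2>&1; then
    echo "present: $dir"
  else
    echo "missing: $dir"
  fi
}

# Usage:
#   audit_status                 # default location
#   audit_status ./.sshx-audit   # project-local directory
```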

command not found

Check the installation:

command -v sshx
sshx --version

If you installed through Go, make sure ~/go/bin or GOPATH/bin is on your PATH.