Claude Is Scary Too
When “Safer” AI Starts Asking the Wrong Questions
Readers of this Substack know I’ve been using OpenClaw as my main AI agent. Because of how powerful it is, I restrict OpenClaw to running somewhat autonomously on an isolated Mac mini, which has very controlled access to my data. Recently, I started running Claude Desktop on my MacBook Air, and I’ve been using it when I want something interactive that can see more real data. The working assumption was simple enough. Claude felt like the safer option. Until I saw this message.

After realizing my configuration didn't give it the capability it needed, Claude asked whether it could search my disk for an API key so that it could call a backend service directly. In other words, it wanted to step outside the controls I had deliberately put in place.
I only caught it because I happened to be reading carefully enough to stop it.
What made this uncomfortable wasn't malice. Claude wasn't trying to do anything sneaky; it thought it was being helpful. After I declined, it even explained why scanning my disk for credentials would be dangerous. But it asked anyway. What?
OpenClaw is aimed at more advanced users, who are warned repeatedly about how much power they are handing over. Claude is positioned as something more approachable for normal users. That shifts the risk in a subtle way.
Too many warnings
Claude prompts constantly. Almost every meaningful action involves a confirmation dialog. Over time, those confirmations stop standing out, and admittedly, I've started approving them reflexively.
Security people have been talking about alert fatigue for years, and Claude itself understands the concept.
The problem is that truly risky requests start to feel indistinguishable from routine ones.
In my case, asking to search a configuration file that might contain credentials looked no different from any other harmless prompt. That puts an unfair burden on the user.
Good interfaces tend to interrupt sparingly, explain clearly what is happening, and make dangerous actions feel meaningfully different. We are not quite there yet in the AI world.
I declined the request, and Claude backed off. Still, this moment stuck with me.
Controlling access in the backend
One thing I try to do is limit what backend services are allowed to do in the first place. Some platforms make this relatively easy by offering fine-grained API permissions.
Microsoft offers a good example. I’ve allowed Claude access to my Outlook contacts. That works fine and it’s useful.
When I try to push it further and get it to read my Outlook email, the backend simply refuses.
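To make that concrete, here is roughly what the refusal looks like at the Microsoft Graph API level, assuming a token (GRAPH_TOKEN, obtained elsewhere) that was granted only the Contacts.Read scope:

```
# Contacts.Read covers this call, so it returns my contacts as expected.
curl -s -H "Authorization: Bearer ${GRAPH_TOKEN}" \
  "https://graph.microsoft.com/v1.0/me/contacts"

# Reading mail requires Mail.Read, which this token was never granted,
# so Graph refuses with a 403 no matter what the client asks for.
curl -s -H "Authorization: Bearer ${GRAPH_TOKEN}" \
  "https://graph.microsoft.com/v1.0/me/messages"
```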
That’s the system doing its job. Unfortunately, many services don’t work this way.
Secrets in plain text on disk
A lot of tools take a simpler approach: once an API key exists, it unlocks everything. MCP gateways, which connect popular software to AI, can define or restrict what an AI is supposed to do, but they don't stop the AI from finding the key itself and calling the service directly.
Claude Desktop makes this easier than it should.
By default, it stores keys in plain text inside a file called claude_desktop_config.json, and that file is readable by the AI itself. (The example below isn't my real setup.)
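Here is a sanitized illustration of that default on macOS. The file location and schema are real; the server name and key are made up:

```
$ cat ~/Library/Application\ Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "example-service": {
      "command": "npx",
      "args": ["-y", "example-mcp-server"],
      "env": {
        "EXAMPLE_API_KEY": "sk-live-abc123..."
      }
    }
  }
}
```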
If the key is there, the AI can find it, cache it, and reuse it to call backend services directly. At that point, the specific intent and safeguards built into skills or MCP gateways start to matter a lot less.
Fetching keys at runtime
Rather than storing credentials on disk, I changed my Claude Desktop setup to fetch keys at runtime. There are no unencrypted secrets sitting around anymore.
My Claude configuration is much simpler than my OpenClaw setup. I use 1Password Environments to keep the small number of keys that my Claude configuration needs in one place.
Instead of having 1Password prompt me repeatedly when it fetches keys, I rely on a simple service account. A wrapper script pulls a token from the macOS keychain, fetches what it needs from 1Password, and lets 1Password manage every access without requiring me to manually approve it.
If you enjoy geeking out, some details are below. Feel free to skip them.
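Here is a minimal sketch of the wrapper, with hypothetical names and paths throughout. The Environments beta addresses an environment by its ENV_ID; since that CLI surface is still evolving, I'm showing the equivalent with op run and a file of op:// secret references:

```
#!/bin/bash
# claude-mcp-wrapper.sh -- names and paths are illustrative.
set -euo pipefail

# 1. Pull the 1Password service-account token from the macOS keychain,
#    where it sits encrypted (see the note below about storing it).
export OP_SERVICE_ACCOUNT_TOKEN="$(security find-generic-password -s op-claude-mcp -w)"

# 2. Resolve every op:// reference at launch time and exec the MCP
#    server with the secrets injected as environment variables.
#    Nothing unencrypted ever touches the disk.
exec op run --env-file="$HOME/.config/claude/mcp.env" -- "$@"
```

Each line of mcp.env is a secret reference rather than a value, e.g. EXAMPLE_API_KEY=op://claude-mcp/example-service/api-key, and claude_desktop_config.json invokes each server through the wrapper instead of embedding keys.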
Note: The ENV_ID for my claude-mcp 1Password Environment is not a secret, because it is useless without the OP_SERVICE_ACCOUNT_TOKEN, which is a secret and sits encrypted inside the macOS keychain.
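Storing that token in the keychain is a one-time step (the item name is illustrative); putting -w last makes security prompt for the value, so the token never lands in shell history:

```
security add-generic-password -a "$USER" -s op-claude-mcp -w
```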
Note that I had to use a different wrapper script for a remote MCP server (hillmeet) that expects an authorization bearer token on its HTTPS requests, rather than just a local environment variable.
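For reference, a sketch of that second wrapper, assuming the mcp-remote bridge; the URL and vault path are illustrative and mirror the layout above:

```
#!/bin/bash
# Variant for the remote hillmeet server -- names and URL are illustrative.
set -euo pipefail

export OP_SERVICE_ACCOUNT_TOKEN="$(security find-generic-password -s op-claude-mcp -w)"

# Resolve the bearer token at launch and pass it as an HTTP header on
# every request, instead of exposing it as a local environment variable.
TOKEN="$(op read 'op://claude-mcp/hillmeet/bearer-token')"
exec npx -y mcp-remote https://hillmeet.example.com/mcp \
  --header "Authorization: Bearer ${TOKEN}"
```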
Once these changes were in place, things got quiet in a good way. There were no unprotected keys in JSON that the AI could just rummage through.
Why this feels better
With this one set of changes, a few things improved immediately.
Disk security. No secrets are stored in plain text.
Auditability. Every access can be logged. (The 1Password Environments feature is still in beta, and its audit logs had not yet been released at the time of publication.)
Rotation. Keys live in one place, so it will be easy for me to rotate them.
Sharing. Setups can move across computers without copying credentials.
Maybe this is erring on the side of caution. What struck me was how easily Claude could explain why letting it scan my disk for credentials was a bad idea, even as it was asking to do exactly that.
My takeaway
For full disclosure, I have a professional background in cybersecurity. Even so, I understand why people treat AI agents like hobby projects. But once those agents start talking to real backend services, the stack deserves a bit more discipline.
Not because the AI is malicious, but because it tries very hard to be helpful.
I didn’t stop using Claude. I just stopped letting it have unfettered access to my keys.
AI disclosure: I used AI tools during the drafting and editing process to help clarify structure and language. All ideas, judgments, and final wording are my own.