AI cannot be trusted with an incomplete sandbox

One of the motivating use cases is AI, which is to say, running sloppy but not actively hostile code:

> Encapsulate AI code to avoid it coming up with matching globals elsewhere or producing misguided attempts at polyfills inline
> Doesn't need to be security-grade isolation to contain impact of faulty code
> Generated code can interact, import and call functions regardless of the isolation

The assumption appears to be that such a sandbox is sufficient to contain the code written by an AI under your direction.

This, I think, is a serious misunderstanding of where AI is at today. LLMs, especially but not exclusively Claude, are [_notorious_](https://www.reddit.com/r/ClaudeAI/comments/1l5ieo5/claude_just_casually_deleted_my_test_file_to_stay/) for using whatever tools are available to them in order to meet their nominal objective. I am quite certain that if Claude decided it needed access to the outer global for some reason then it would have no compunctions about writing code which reached for `(function(){}).constructor` in order to get that.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

AI cannot be trusted with an incomplete sandbox #10

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

AI cannot be trusted with an incomplete sandbox #10

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions