Skip to content

AI cannot be trusted with an incomplete sandbox #10

@bakkot

Description

@bakkot

One of the motivating use cases is AI, which is to say, running sloppy but not actively hostile code:

Encapsulate AI code to avoid it coming up with matching globals elsewhere or producing misguided attempts at polyfills inline
Doesn't need to be security-grade isolation to contain impact of faulty code
Generated code can interact, import and call functions regardless of the isolation

The assumption appears to be that such a sandbox is sufficient to contain the code written by an AI under your direction.

This, I think, is a serious misunderstanding of where AI is at today. LLMs, especially but not exclusively Claude, are notorious for using whatever tools are available to them in order to meet their nominal objective. I am quite certain that if Claude decided it needed access to the outer global for some reason then it would have no compunctions about writing code which reached for (function(){}).constructor in order to get that.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions