feat: btw memory #66

gadenbuie · 2025-06-02T17:03:57Z

Adds a minimal memory system using a single YAML file to store stable, user-provided facts about the analysis problem and dataset characteristics that won't change during the project lifecycle.

The memory is stored in a project-level YAML file (typically btw-memory.yaml):

project_context:
  problem_description: <string>       # High-level description of the business problem to solve
  objectives: <array[string]>         # Specific analysis goals and questions to answer
  success_criteria: <array[string]>   # How to measure if the analysis was successful
  constraints: <array[string]>        # Limitations, requirements, or restrictions for the analysis
  business_context: <array[string]>   # Domain knowledge, organizational context, or background information

data_sources: <array[DataSource]>
  # where DataSource:
  - name: <string>                    # Unique identifier for the dataset
    description: <string>             # Human-readable description of the dataset
    source: <string>                  # File path, SQL table name, or source description
    code: <string|multiline>          # R code to load this dataset
    notes: <array[string]>            # Contextual notes about this dataset
    variables: <array[Variable]>      # Array of variable metadata
      # where Variable:
      - name: <string>                # Variable/column name
        notes: <array[string]>        # Array of contextual notes about this variable
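
To make the schema above concrete, here is a minimal sketch of a populated memory written to disk and read back. The key names come from the schema; every value (the churn example, the `customers` dataset, the file paths) is made up for illustration. The `yaml::write_yaml()` arguments mirror the call shown in the PR diff.

```r
# Hypothetical example values; only the key names come from the schema above.
memory <- list(
  project_context = list(
    problem_description = "Understand drivers of customer churn.",
    objectives = list("Identify the strongest predictors of churn"),
    constraints = list("Analysis must use only 2024 data")
  ),
  data_sources = list(
    list(
      name = "customers",
      description = "One row per customer account",
      source = "data/customers.csv",
      code = 'customers <- read.csv("data/customers.csv")',
      notes = list("Accounts opened before 2020 are missing region"),
      variables = list(
        list(name = "tenure", notes = list("Measured in months"))
      )
    )
  )
)

# Write to a temporary location so the sketch does not touch a real project
path <- file.path(tempdir(), "btw-memory.yaml")
yaml::write_yaml(memory, path, indent.mapping.sequence = TRUE, indent = 2)

# Round trip: the file reads back into a nested list with the same keys
restored <- yaml::read_yaml(path)
names(restored)
```

Unnamed lists are used for the array fields so that single-item entries are still written as YAML sequences rather than scalars.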

This PR currently implements the project context memory; I've implemented parts of the data sources memory locally. But as a first experiment we can merge without the data sources memory.

Tool Functions

Project Context Tools

btw_tool_memory_project_context_add(key, content) - Append to project context section
btw_tool_memory_project_context_read(key) - Read project context (all or specific key)
btw_tool_memory_project_context_replace(key, contents) - Replace entire project context section

Data Source Tools

btw_tool_memory_data_source_add(name, key, content) - Append to datasets or variables
btw_tool_memory_data_source_read(name, key) - Read data sources (all or specific name/key)
btw_tool_memory_data_source_replace(name, key, contents) - Replace entire data sources section
btw_tool_memory_data_source_variable_add(data_source_name, variable_name, note) - Add notes to a specific variable
btw_tool_memory_data_source_variable_replace(data_source_name, variable_name, notes) - Replace notes for a specific variable

simonpcouch

Not very thorough for now, but will come back to this with a fresh mind tomorrow morning.

My initial expectation was that if I launched a chat and asked to add something to memory (without creating the file myself), the model would be able to use a tool to create a btw-memory.yaml file. Instead, I saw:

Warning: Failed to evaluate 2 tool calls.
✖ [btw_tool_memory_project_context_add
  (toolu_01EA9ipXAxAHXAJBLDBc2yXY)]: cannot coerce type 'closure' to
  vector of type 'character'

The @contents from that tool result:

<ellmer::ContentToolResult>
 @ value  : NULL
 @ error  :List of 2
 .. $ message: chr "cannot coerce type 'closure' to vector of type 'character'"
 .. $ call   : language paste0(before, x, after)
 .. - attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
 @ extra  : list()
 @ request: <ellmer::ContentToolRequest>
 .. @ id       : chr "toolu_0143eU9keVgmG3wdi1ZQwyDm"
 .. @ name     : chr "btw_tool_memory_project_context_add"
 .. @ arguments:List of 3
 .. .. $ key    : chr "problem_description"
 .. .. $ content:List of 1
 .. ..  ..$ : chr "Develop the btw package with its source code located in the current directory."
 .. .. $ intent : chr "Storing information about the btw package development project"

gadenbuie · 2025-06-02T21:27:09Z

My initial expectation was that if I launched a chat and asked to add something to memory (without creating the file myself), the model would be able to use a tool to create a btw-memory.yaml file.

Totally! I did some refactoring before I committed and missed the "memory file doesn't exist yet" case in the _add() and _replace() verbs. Should be fixed for you for tomorrow.
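
The "memory file doesn't exist yet" case could be handled roughly as in the sketch below. This is not the PR's actual fix: `read_memory()` and `memory_project_context_add()` are hypothetical helpers, written on the assumption that a missing file should behave like empty memory and that tool arguments may arrive as a list (as in the `content` argument of the failing request above).

```r
# Hypothetical helpers for illustration; not btw's actual implementation.
read_memory <- function(path) {
  # Treat a missing file as empty memory rather than an error
  if (!file.exists(path)) {
    return(list())
  }
  data <- yaml::read_yaml(path)
  # An empty file parses to NULL, so normalize that to an empty list too
  if (is.null(data)) list() else data
}

memory_project_context_add <- function(path, key, content) {
  memory_data <- read_memory(path)
  if (is.null(memory_data$project_context)) {
    memory_data$project_context <- list()
  }
  existing <- memory_data$project_context[[key]]
  # Flatten list-typed tool arguments to character before appending
  updated <- c(existing, as.character(unlist(content)))
  # Store as an unnamed list so one-item entries stay YAML sequences
  memory_data$project_context[[key]] <- as.list(updated)
  yaml::write_yaml(memory_data, path, indent.mapping.sequence = TRUE, indent = 2)
  invisible(memory_data)
}

# Usage: the first call creates the file, the second appends to it
memory_path <- tempfile(fileext = ".yaml")
memory_project_context_add(memory_path, "objectives", list("Describe churn by region"))
memory_project_context_add(memory_path, "objectives", "Compare 2023 and 2024 cohorts")
```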

simonpcouch

Could we add a couple lines somewhere in the documentation that clarifies the distinction between btw.md and btw's memory? Before playing with the PR, I had anticipated that "memory" would have a meaning like Claude Code's, where the tool can just integrate content into the claude.md file.

simonpcouch · 2025-06-03T13:13:00Z

R/tool-memory.R

+    }
+  }
+
+  yaml::write_yaml(data, path, indent.mapping.sequence = TRUE, indent = 2)


Do we think of this file as only to be edited by the model? Should there be some sort of cautionary comment at the top?

# Generated by btw: do not edit by hand

The reason I ask is because the formatting here is quite precise (for good reason). Feels like, if a user wants the model to know something, they ought to drop it in btw.md?

Good idea! I did want this file to be human editable (hence YAML and not JSON), but you're right that there's a precise structure to follow. I'm thinking we could add a comment header to this file explaining the structure and also letting people know that the file may be overwritten by the tool. The biggest consequence is that the yaml package can't preserve comments in a round trip read/write (so we'd always inject the standard instructions header).
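
One way the header idea might look is sketched below. `write_memory_with_header()` is a hypothetical name and the header wording is invented for illustration. The sketch relies on `yaml::as.yaml()` to produce the body and re-injects the header on every write, since comments do not survive a read/write round trip.

```r
# Hypothetical helper: always prepend a standard comment header when writing.
write_memory_with_header <- function(memory_data, path) {
  # Invented wording; the real header text would be decided in the package
  header <- c(
    "# btw memory file. You may edit it by hand, but keep the structure intact.",
    "# btw may rewrite this file; comments other than this header are not preserved."
  )
  body <- yaml::as.yaml(memory_data, indent.mapping.sequence = TRUE, indent = 2)
  writeLines(c(header, body), path)
  invisible(path)
}

# Usage: the header is plain YAML comments, so reading ignores it
header_path <- tempfile(fileext = ".yaml")
write_memory_with_header(
  list(project_context = list(problem_description = "Example problem")),
  header_path
)
reread <- yaml::read_yaml(header_path)
```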

simonpcouch · 2025-06-03T13:29:15Z

R/tool-memory.R

+}
+
+
+path_find_btw_memory <- function(path = NULL, must_exist = TRUE) {


Noting that in the list_files tool, we've just considered the working directory as the project directory:

btw/tests/testthat/_snaps/tool-files.md

Lines 22 to 25 in ee33182

btw_tool_files_list_files("../")

Condition

Error in `check_path_within_current_wd()`:

! You are not allowed to list or read files outside of the project directory. Make sure that `path` is relative to the current working directory.

True, but there are slightly different use cases. We use the "project directory" idea when looking for btw.md, but we don't want the list_files tool to access any file on the user's machine. Hence the "working directory and below" constraint on the list_files tool.

gadenbuie · 2025-06-03T13:52:18Z

Could we add a couple lines somewhere in the documentation that clarifies the distinction between btw.md and btw's memory? Before playing with the PR, I had anticipated that "memory" would have a meaning like Claude Code's, where the tool can just integrate content into the claude.md file.

Yeah great point. I considered this kind of approach, but I felt YAML was a better choice because it makes it easier for us to read, write and update just parts of the memory without resorting to parsing the markdown file.

My other feeling is that as a support primarily for EDA, there's a lot of refinement that happens where both the user and the LLM start out not knowing much about the data and can learn together. My sense is that we wouldn't want to throw the whole memory at the LLM in the system prompt, but I could be wrong about that. Or maybe we want to include parts of the memory in the system prompt and the YAML file helps us do that filtering.
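
As a rough illustration of the filtering idea, the sketch below pulls only selected project context keys out of the YAML file and renders them as text that could be placed in a system prompt. `memory_for_prompt()` and its default `keys` are assumptions made for this example, not part of the PR.

```r
# Hypothetical helper: render a subset of project context for a system prompt.
memory_for_prompt <- function(path, keys = c("problem_description", "objectives")) {
  memory_data <- yaml::read_yaml(path)
  context <- memory_data$project_context
  # Keep only the requested keys that are actually present
  selected <- context[intersect(keys, names(context))]
  if (length(selected) == 0) {
    return("")
  }
  yaml::as.yaml(list(project_context = selected), indent.mapping.sequence = TRUE)
}

# Usage with a small made-up memory file
prompt_path <- tempfile(fileext = ".yaml")
yaml::write_yaml(
  list(project_context = list(
    problem_description = "Example problem",
    constraints = list("Example constraint")
  )),
  prompt_path
)
prompt_text <- memory_for_prompt(prompt_path)
cat(prompt_text)
```

Because the memory is structured, this kind of selection is a list subset rather than a markdown parsing step.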

gadenbuie added 4 commits June 2, 2025 11:49

feat: read/write btw memory

90ff99b

feat: project context memory

2d4d581

docs: fix tool description in docs

33efe7a

chore: add annotations

6c2ed44

simonpcouch reviewed Jun 2, 2025


gadenbuie added 3 commits June 2, 2025 17:24

fix: writing to memory when the memory file doesn't exist yet

056edfe

refactor: use full name memory_data

fe36db4

chore: don't ignore dot files

f3f43ed

simonpcouch approved these changes Jun 3, 2025


chore: fixes from code review

1621333
