-
Notifications
You must be signed in to change notification settings - Fork 3
feat: btw memory #66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: btw memory #66
Conversation
simonpcouch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very thorough for now, but will come back to this with a fresh mind tomorrow morning.
My initial expectation was that if I launched a chat and asked to add something to memory (without creating the file myself), the model would be able to use a tool to create a btw-memory.yaml file. Instead, I saw:
Warning: Failed to evaluate 2 tool calls.
✖ [btw_tool_memory_project_context_add
(toolu_01EA9ipXAxAHXAJBLDBc2yXY)]: cannot coerce type 'closure' to
vector of type 'character'
The @contents from that tool result:
<ellmer::ContentToolResult>
@ value : NULL
@ error :List of 2
.. $ message: chr "cannot coerce type 'closure' to vector of type 'character'"
.. $ call : language paste0(before, x, after)
.. - attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
@ extra : list()
@ request: <ellmer::ContentToolRequest>
.. @ id : chr "toolu_0143eU9keVgmG3wdi1ZQwyDm"
.. @ name : chr "btw_tool_memory_project_context_add"
.. @ arguments:List of 3
.. .. $ key : chr "problem_description"
.. .. $ content:List of 1
.. .. ..$ : chr "Develop the btw package with its source code located in the current directory."
.. .. $ intent : chr "Storing information about the btw package development project"
Totally! I did some refactoring before I committed and missed the "memory file doesn't exist yet" case in the |
simonpcouch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
| } | ||
|
|
||
| yaml::write_yaml(data, path, indent.mapping.sequence = TRUE, indent = 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we think of this file as only to be edited by the model? Should there be some sort of cautionary comment at the top?
# Generated by btw: do not edit by hand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I ask is because the formatting here is quite precise (for good reason). Feels like, if a user wants the model to know something, they ought to drop it in btw.md?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I did want this file to be human editable (hence YAML and not JSON), but you're right that there's a precise structure to follow. I'm thinking we could a comment header in this file explaining the structure and also letting people know that the file may be overwritten by the tool – the biggest consequence is that the yaml package can't preserve comments in a round trip read/write (so we'd always inject the standard instructions header).
| } | ||
|
|
||
|
|
||
| path_find_btw_memory <- function(path = NULL, must_exist = TRUE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that in the list_files tool, we've just considered the working directory as the project directory:
btw/tests/testthat/_snaps/tool-files.md
Lines 22 to 25 in ee33182
| btw_tool_files_list_files("../") | |
| Condition | |
| Error in `check_path_within_current_wd()`: | |
| ! You are not allowed to list or read files outside of the project directory. Make sure that `path` is relative to the current working directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True but there are slightly different use cases. We use the "project directory" idea when looking for btw.md, but we don't want the list_files tool to access any file on the user's machine. Hence the working directory and below constraint on the list files tool.
Yeah great point. I considered this kind of approach, but I felt YAML was a better choice because it makes it easier for us to read, write and update just parts of the memory without resorting to parsing the markdown file. My other feeling is that as a support primarily for EDA, there's a lot of refinement that happens where both the user and the LLM start out not knowing much about the data and can learn together. My sense is that we wouldn't want to throw the whole memory at the LLM in the system prompt, but I could be wrong about that. Or maybe we want to include parts of the memory in the system prompt and the YAML file helps us do that filtering. |

Adds a minimal memory system using a single YAML file to store stable, user-provided facts about the analysis problem and dataset characteristics that won't change during the project lifecycle.
The memory is stored in a project-level YAML file (typically
btw-memory.yaml):This PR currently implements the project context memory; I've implemented parts of the data sources memory locally. But as a first experiment we can merge without the data sources memory.
Tool Functions
Project Context Tools
btw_tool_memory_project_context_add(key, content)- Append to project context sectionbtw_tool_memory_project_context_read(key)- Read project context (all or specific key)btw_tool_memory_project_context_replace(key, contents)- Replace entire project context sectionData Source Tools
btw_tool_memory_data_source_add(name, key, content)- Append to datasets or variablesbtw_tool_memory_data_source_read(name, key)- Read data sources (all or specific name/key)btw_tool_memory_data_source_replace(name, key, contents)- Replace entire data sources sectionbtw_tool_memory_data_source_variable_add(data_source_name, variable_name, note)- Add notes to a specific variablebtw_tool_memory_data_source_variable_replace(data_source_name, variable_name, notes)- Read notes for a specific variable