Description
The following test cases produce invalid directory names on Windows, which causes metacoder to fail when it reaches them. This is expensive when the failure occurs near the end of a long test run. In these specific cases, the colon (':') is not a valid path character on Windows. Linux and macOS will have their own failure cases.
Suggestions:
- Test name requirements should be added to the config file specification.
- The YAML config file should be validated in some way prior to starting the full test suite, in order to reject test cases that fail to conform to the correct specification.
- name: https://www.ncbi.nlm.nih.gov/books/NBK1256/_HTML
  metrics:
  - CorrectnessMetric
  input: What are the last two rows of table 2
  expected_output: "Behavior disorder/\nPsychosis 10% \nAltered mentation\n\
    Impaired reality testing\nCone-rod\ndystrophy 70% \nLoss of central\
    \ vision & color vision\nAbnormal fundoscopic exam"
  threshold: 0.9
- name: PMID:40307501_Figure_Legend
  metrics:
  - CorrectnessMetric
  input: What is the first sentence of figure 1 legend
  expected_output: Proposed system for bio-accelerated weathering of ultramafic materials
    for carbon mineralization
  threshold: 0.9
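The second suggestion could be sketched as a pre-flight check that runs over the parsed test cases before any coder is started. This is only an illustration, not metacoder's API: the function names `name_problems` and `validate_case_names` are hypothetical, the cases are assumed to be already loaded from the YAML file as a list of dicts, and the rule set (Windows-invalid characters, reserved device names, trailing space or dot) is one possible conservative definition of a valid test name.

```python
import re

# Characters Windows rejects in a file or directory name, plus ASCII control characters.
_WINDOWS_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Device names Windows reserves regardless of extension.
_WINDOWS_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def name_problems(name: str) -> list[str]:
    """Return the reasons why `name` is unsafe as a single directory name."""
    problems = []
    if not name:
        problems.append("empty name")
    bad = sorted(set(_WINDOWS_INVALID.findall(name)))
    if bad:
        problems.append("invalid characters: " + " ".join(repr(c) for c in bad))
    if name.split(".")[0].upper() in _WINDOWS_RESERVED:
        problems.append("reserved Windows device name")
    if name != name.rstrip(" ."):
        problems.append("trailing space or dot")
    return problems


def validate_case_names(cases: list[dict]) -> dict[str, list[str]]:
    """Map each offending case name to its problems; an empty dict means all pass."""
    report = {}
    for case in cases:
        name = str(case.get("name", ""))
        problems = name_problems(name)
        if problems:
            report[name] = problems
    return report


# The two names from this issue, plus a hypothetical compliant rename.
cases = [
    {"name": "https://www.ncbi.nlm.nih.gov/books/NBK1256/_HTML"},
    {"name": "PMID:40307501_Figure_Legend"},
    {"name": "PMID_40307501_Figure_Legend"},
]
for name, problems in validate_case_names(cases).items():
    print(f"{name}: {'; '.join(problems)}")
```

Applying the Windows rules on every platform, rather than only when running on Windows, would keep a config file written on Linux or macOS portable to Windows users.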
Stack trace (for first case):
Progress: 6/25 - goose/claude-4-sonnet/https://www.ncbi.nlm.nih.gov/books/NBK1256/_HTML with servers: artl, simple-pubmed, ols
Running goose with claude-4-sonnet on case 'https://www.ncbi.nlm.nih.gov/books/NBK1256/_HTML'
📁 Preparing workdir: eval_workdir\claude-4-sonnet_goose_https:\www.ncbi.nlm.nih.gov\books\NBK1256\_HTML_artl_simple-pubmed_ols\claude-4-sonnet_goose_https:\www.ncbi.nlm.nih.gov\books\NBK1256\_HTML
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Scripts\metacoder.exe\__main__.py", line 10, in <module>
sys.exit(main())
~~~~^^
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\click\core.py", line 1442, in __call__
return self.main(*args, **kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\click\core.py", line 1363, in main
rv = self.invoke(ctx)
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\click\core.py", line 1830, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\click\core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\click\core.py", line 794, in invoke
return callback(*args, **kwargs)
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\metacoder\metacoder.py", line 587, in eval_command
results = runner.run_all_evals(dataset, workdir_path, coders_list)
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\metacoder\evals\runner.py", line 502, in run_all_evals
results = self.run_single_eval(
model_name,
...<4 lines>...
coder_config,
)
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\metacoder\evals\runner.py", line 279, in run_single_eval
output: CoderOutput = coder.run(case.input)
~~~~~~~~~^^^^^^^^^^^^
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\metacoder\coders\goose.py", line 146, in run
self.prepare_workdir()
~~~~~~~~~~~~~~~~~~~~^^
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\metacoder\coders\base_coder.py", line 356, in prepare_workdir
with change_directory(self.workdir):
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "C:\Users\CTParker\AppData\Roaming\uv\python\cpython-3.13.2-windows-x86_64-none\Lib\contextlib.py", line 141, in __enter__
return next(self.gen)
File "C:\Users\CTParker\PycharmProjects\mcp_literature_eval\.venv\Lib\site-packages\metacoder\coders\base_coder.py", line 66, in change_directory
Path(path).mkdir(parents=True, exist_ok=True)
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\CTParker\AppData\Roaming\uv\python\cpython-3.13.2-windows-x86_64-none\Lib\pathlib\_local.py", line 722, in mkdir
os.mkdir(self, mode)
~~~~~~~~^^^^^^^^^^^^
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'eval_workdir\\claude-4-sonnet_goose_https:\\www.ncbi.nlm.nih.gov\\books\\NBK1256\\_HTML_artl_simple-pubmed_ols\\claude-4-sonnet_goose_https:\\www.ncbi.nlm.nih.gov\\books\\NBK1256\\_HTML'
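The path in the error message suggests two separate effects of using a URL as a test name: the forward slashes become directory separators, and the `https:` component keeps its colon. The sketch below reproduces that splitting with `pathlib.PureWindowsPath`, which can be run on any platform. How metacoder actually assembles the workdir is an assumption here, and the path is simplified (the server suffix and the repeated segment from the trace are omitted).

```python
from pathlib import PureWindowsPath

case_name = "https://www.ncbi.nlm.nih.gov/books/NBK1256/_HTML"

# Assumed, simplified construction: a base directory joined with a
# "<model>_<coder>_<case name>" segment, as the trace output suggests.
workdir = PureWindowsPath("eval_workdir") / f"claude-4-sonnet_goose_{case_name}"

# Each "/" in the case name starts a new path component.
print(workdir.parts)

# Components that still contain a colon are the ones Windows refuses to create.
offending = [part for part in workdir.parts if ":" in part]
print(offending)
```

Under these assumptions a single test name expands into several nested directories, so a name check would need to reject both `:` and `/`, not only the colon.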