tests: add final-status execution tests (done token and fail status) #162
base: main
Changes from all commits
9902f3d
5a0b42f
148cf8e
b62e0fa
e164262
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| name: CI | ||
|
|
||
| on: | ||
| push: | ||
| branches: [main, master] | ||
| pull_request: | ||
| branches: [main, master] | ||
|
|
||
| jobs: | ||
| test: | ||
| runs-on: ubuntu-latest | ||
|
|
||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - uses: actions/setup-python@v4 | ||
| with: | ||
| python-version: "3.11" | ||
|
|
||
| - name: Install dev requirements | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| pip install -r requirements-dev.txt | ||
|
|
||
| - name: Run linters | ||
| env: | ||
| PYTHONPATH: ${{ github.workspace }} | ||
| run: | | ||
| python -m black --check . | ||
| python -m isort --check-only . | ||
| python -m flake8 . | ||
|
|
||
| - name: Run tests | ||
| env: | ||
| PYTHONPATH: ${{ github.workspace }} | ||
| run: | | ||
| python -m pytest -q | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -161,4 +161,7 @@ cython_debug/ | |||||
| # option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||||||
| #.idea/ | ||||||
| logs/ | ||||||
| .DS_Store | ||||||
| .DS_Store | ||||||
|
|
||||||
| # Local env file for secrets | ||||||
| .env | ||||||
|
Review comment: Duplicate .env entry.
Proposed fix:
 .DS_Store
-
-# Local env file for secrets
-.env
The existing entry at line 125 already covers environment files. |
||||||
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| From bd11c7fff67d0667527f5480bc2a71e8fbd197f9 Mon Sep 17 00:00:00 2001 | ||
| From: Vasyl198 <ananiivasilii@gmail.com> | ||
| Date: Wed, 24 Dec 2025 14:29:28 +0200 | ||
| Subject: [PATCH 2/2] ci: use pre-commit mirrors for flake8 and add basic hooks | ||
|
|
||
| --- | ||
| .pre-commit-config.yaml | 11 ++++++++--- | ||
| 1 file changed, 8 insertions(+), 3 deletions(-) | ||
|
|
||
| diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml | ||
| index 8665a49..7e0e7f6 100644 | ||
| --- a/.pre-commit-config.yaml | ||
| +++ b/.pre-commit-config.yaml | ||
| @@ -8,8 +8,13 @@ repos: | ||
| rev: 5.12.0 | ||
| hooks: | ||
| - id: isort | ||
| - - repo: https://gitlab.com/pycqa/flake8 | ||
| - rev: 6.0.0 | ||
| + - repo: https://github.com/pre-commit/mirrors-flake8 | ||
| + rev: 7.1.0 | ||
| hooks: | ||
| - id: flake8 | ||
| - args: ["--max-line-length=120"] | ||
| + args: ["--max-line-length=88", "--extend-ignore=E203,W503"] | ||
| + - repo: https://github.com/pre-commit/pre-commit-hooks | ||
| + rev: v4.6.0 | ||
| + hooks: | ||
| + - id: end-of-file-fixer | ||
| + - id: trailing-whitespace | ||
| -- | ||
| 2.49.0.windows.1 | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| repos: | ||
| - repo: https://github.com/psf/black | ||
| rev: 25.11.0 | ||
| hooks: | ||
| - id: black | ||
| language_version: python3.11 | ||
| - repo: https://github.com/PyCQA/isort | ||
| rev: 5.12.0 | ||
| hooks: | ||
| - id: isort | ||
| - repo: https://github.com/pre-commit/mirrors-flake8 | ||
| rev: 7.1.0 | ||
| hooks: | ||
| - id: flake8 | ||
| args: ["--max-line-length=88", "--extend-ignore=E203,W503"] | ||
| - repo: https://github.com/pre-commit/pre-commit-hooks | ||
| rev: v4.6.0 | ||
| hooks: | ||
| - id: end-of-file-fixer | ||
| - id: trailing-whitespace |
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -145,14 +145,23 @@ def show_permission_dialog(code: str, action_description: str): | |||||||||||||||||||||
| return False | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
|
|
||||||||||||||||||||||
| def execute_code(code_str: str): | ||||||||||||||||||||||
| """Execute the provided code string in a controlled namespace. | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| This helper centralizes code execution so it can be mocked in tests. | ||||||||||||||||||||||
| """ | ||||||||||||||||||||||
| # Execute in globals so that imports persist if needed by subsequent steps. | ||||||||||||||||||||||
| exec(code_str, globals()) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
|
|
||||||||||||||||||||||
| def scale_screen_dimensions(width: int, height: int, max_dim_size: int): | ||||||||||||||||||||||
| scale_factor = min(max_dim_size / width, max_dim_size / height, 1) | ||||||||||||||||||||||
| safe_width = int(width * scale_factor) | ||||||||||||||||||||||
| safe_height = int(height * scale_factor) | ||||||||||||||||||||||
| return safe_width, safe_height | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
|
|
||||||||||||||||||||||
| - def run_agent(agent, instruction: str, scaled_width: int, scaled_height: int): | ||||||||||||||||||||||
| + def run_agent(agent, instruction: str, scaled_width: int, scaled_height: int, require_exec_confirmation: bool = True): | ||||||||||||||||||||||
| global paused | ||||||||||||||||||||||
| obs = {} | ||||||||||||||||||||||
| traj = "Task:\n" + instruction | ||||||||||||||||||||||
|
|
@@ -182,8 +191,42 @@ def run_agent(agent, instruction: str, scaled_width: int, scaled_height: int): | |||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Get next action code from the agent | ||||||||||||||||||||||
| info, code = agent.predict(instruction=instruction, observation=obs) | ||||||||||||||||||||||
| print('DEBUG: agent.predict returned code:', repr(code)) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Normalize to (code_str, status) form. Some agents return [code, status] | ||||||||||||||||||||||
| code_str = None | ||||||||||||||||||||||
| status = None | ||||||||||||||||||||||
| try: | ||||||||||||||||||||||
| if isinstance(code, (list, tuple)) and len(code) >= 1: | ||||||||||||||||||||||
| code_str = code[0] | ||||||||||||||||||||||
| if len(code) > 1: | ||||||||||||||||||||||
| status = str(code[1]).lower().strip() | ||||||||||||||||||||||
| elif isinstance(code, str): | ||||||||||||||||||||||
| code_str = code | ||||||||||||||||||||||
| except Exception: | ||||||||||||||||||||||
| code_str = str(code) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Interpret explicit status when present, or exact token in code_str | ||||||||||||||||||||||
| # Semantics: if code_str itself is a terminal token ("done"/"fail"), stop without executing. | ||||||||||||||||||||||
| # If an explicit status is provided (e.g., [code, "done"]), execute the provided code_str once, then stop. | ||||||||||||||||||||||
| if isinstance(code_str, str) and code_str.strip().lower() in ("done", "fail"): | ||||||||||||||||||||||
| break | ||||||||||||||||||||||
| if status in ("done", "fail"): | ||||||||||||||||||||||
| # Execute the final code, then stop | ||||||||||||||||||||||
| execute_final = code_str | ||||||||||||||||||||||
| # fall through to execution branch below | ||||||||||||||||||||||
| final_exit_after_exec = True | ||||||||||||||||||||||
| else: | ||||||||||||||||||||||
| execute_final = None | ||||||||||||||||||||||
| final_exit_after_exec = False | ||||||||||||||||||||||
|
Review comment on lines +218 to +221
Unused variable: final_exit_after_exec. This variable is set but never used.
Proposed fix: remove unused variable
 if status in ("done", "fail"):
 # Execute the final code, then stop
 execute_final = code_str
- # fall through to execution branch below
- final_exit_after_exec = True
+ # fall through to execution branch below; do_exit_after set below
 else:
 execute_final = None
- final_exit_after_exec = False
Tools: Ruff (0.14.10), 221-221: Local variable final_exit_after_exec is assigned to but never used. Remove assignment to unused variable (F841) |
||||||||||||||||||||||
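The status handling in this hunk can be expressed with a single exit flag, which is the direction the review comment above points in. The following is a minimal sketch, not the PR's code: `normalize_prediction` and `plan_step` are hypothetical helper names, and treating an unusable prediction (for example `None`) as "stop" is an assumption made here for illustration.

```python
from typing import Any, Optional, Tuple

TERMINAL = ("done", "fail")


def normalize_prediction(code: Any) -> Tuple[Optional[str], Optional[str]]:
    """Return (code_str, status) from a str or a [code, status] sequence."""
    if isinstance(code, (list, tuple)) and len(code) >= 1:
        code_str = code[0] if isinstance(code[0], str) else str(code[0])
        status = str(code[1]).lower().strip() if len(code) > 1 else None
        return code_str, status
    if isinstance(code, str):
        return code, None
    # Assumption: anything else carries no runnable code.
    return None, None


def plan_step(code: Any) -> Tuple[Optional[str], bool]:
    """Return (code_to_run, exit_after); code_to_run is None when nothing runs."""
    code_str, status = normalize_prediction(code)
    if code_str is None or code_str.strip().lower() in TERMINAL:
        # Bare terminal token (or nothing usable): stop without executing.
        return None, True
    # An explicit terminal status means: run this code once, then stop.
    return code_str, status in TERMINAL
```

Keeping this decision in a pure function would let the done-token and fail-status cases be tested without mocking the GUI dialog or `exec`.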
|
|
||||||||||||||||||||||
| - if "done" in code[0].lower() or "fail" in code[0].lower(): | ||||||||||||||||||||||
| # If execute_final is set, we want to execute it once and then exit the loop | ||||||||||||||||||||||
| if execute_final is not None: | ||||||||||||||||||||||
| code_to_run = execute_final | ||||||||||||||||||||||
| do_exit_after = True | ||||||||||||||||||||||
| else: | ||||||||||||||||||||||
| code_to_run = code_str | ||||||||||||||||||||||
| do_exit_after = False | ||||||||||||||||||||||
| if platform.system() == "Darwin": | ||||||||||||||||||||||
| os.system( | ||||||||||||||||||||||
| f'osascript -e \'display dialog "Task Completed" with title "OpenACI Agent" buttons "OK" default button "OK"\'' | ||||||||||||||||||||||
|
|
@@ -205,14 +248,36 @@ def run_agent(agent, instruction: str, scaled_width: int, scaled_height: int): | |||||||||||||||||||||
|
|
||||||||||||||||||||||
| else: | ||||||||||||||||||||||
| time.sleep(1.0) | ||||||||||||||||||||||
| - print("EXECUTING CODE:", code[0]) | ||||||||||||||||||||||
| + print("EXECUTING CODE:", code_to_run) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Check for pause state before execution | ||||||||||||||||||||||
| while paused: | ||||||||||||||||||||||
| time.sleep(0.1) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Ask for permission before executing | ||||||||||||||||||||||
| - exec(code[0]) | ||||||||||||||||||||||
| allowed = True | ||||||||||||||||||||||
| if require_exec_confirmation: | ||||||||||||||||||||||
| # Try platform GUI confirmation first | ||||||||||||||||||||||
| try: | ||||||||||||||||||||||
| allowed = show_permission_dialog(code_to_run, "execute this action") | ||||||||||||||||||||||
| except Exception: | ||||||||||||||||||||||
| allowed = False | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| if not allowed: | ||||||||||||||||||||||
| try: | ||||||||||||||||||||||
| print("Agent proposes to execute the following code:\n") | ||||||||||||||||||||||
| print(code_to_run) | ||||||||||||||||||||||
| resp = input("Execute this code? (y/N): ") | ||||||||||||||||||||||
| if resp.lower().strip() == "y": | ||||||||||||||||||||||
| allowed = True | ||||||||||||||||||||||
| except Exception: | ||||||||||||||||||||||
| allowed = False | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| if not allowed: | ||||||||||||||||||||||
| print("Execution denied by user; skipping this action.") | ||||||||||||||||||||||
| else: | ||||||||||||||||||||||
| execute_code(code_to_run) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| time.sleep(1.0) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Update task and subtask trajectories | ||||||||||||||||||||||
|
|
@@ -224,6 +289,10 @@ def run_agent(agent, instruction: str, scaled_width: int, scaled_height: int): | |||||||||||||||||||||
| + info["executor_plan"] | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # If this was a final command (status indicated done/fail), exit after executing | ||||||||||||||||||||||
| if do_exit_after: | ||||||||||||||||||||||
| break | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
|
|
||||||||||||||||||||||
| def main(): | ||||||||||||||||||||||
| parser = argparse.ArgumentParser(description="Run AgentS3 with specified model.") | ||||||||||||||||||||||
|
|
@@ -316,6 +385,13 @@ def main(): | |||||||||||||||||||||
| help="Enable local coding environment for code execution (WARNING: Executes arbitrary code locally)", | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| parser.add_argument( | ||||||||||||||||||||||
| "--allow_exec", | ||||||||||||||||||||||
| action="store_true", | ||||||||||||||||||||||
| default=False, | ||||||||||||||||||||||
| help="Allow executing agent-generated code without confirmation (dangerous).", | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| args = parser.parse_args() | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Re-scales screenshot size to ensure it fits in UI-TARS context limit | ||||||||||||||||||||||
|
|
@@ -368,13 +444,16 @@ def main(): | |||||||||||||||||||||
| enable_reflection=args.enable_reflection, | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Determine whether user approval is required before executing agent code | ||||||||||||||||||||||
| require_exec_confirmation = not getattr(args, "allow_exec", False) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| while True: | ||||||||||||||||||||||
| query = input("Query: ") | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| agent.reset() | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Run the agent on your own device | ||||||||||||||||||||||
| - run_agent(agent, query, scaled_width, scaled_height) | ||||||||||||||||||||||
| + run_agent(agent, query, scaled_width, scaled_height, require_exec_confirmation=require_exec_confirmation) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| response = input("Would you like to provide another query? (y/n): ") | ||||||||||||||||||||||
| if response.lower() != "y": | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
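The confirmation flow added in this file (GUI dialog first, console prompt as a fallback, and `--allow_exec` to skip both) can be isolated into one decision function. This is a sketch under stated assumptions rather than the PR's implementation: `is_execution_allowed` is a hypothetical name, and the two prompt callables stand in for `show_permission_dialog` and the `input()` prompt so that the logic can run without a display or a terminal.

```python
from typing import Callable


def is_execution_allowed(
    code_to_run: str,
    require_confirmation: bool,
    gui_confirm: Callable[[str], bool],
    console_confirm: Callable[[str], bool],
) -> bool:
    """Decide whether agent-generated code may run."""
    if not require_confirmation:
        # Corresponds to running with --allow_exec.
        return True
    # Try the GUI prompt first, then fall back to the console prompt.
    for confirm in (gui_confirm, console_confirm):
        try:
            if confirm(code_to_run):
                return True
        except Exception:
            # A failing prompt counts as a denial, never as consent.
            continue
    return False
```

As in the diff, a denial in the GUI dialog still falls through to the console prompt; whether an explicit GUI "no" should end the decision immediately is a design question this sketch does not settle.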
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| pytest | ||
| pillow | ||
| black | ||
| flake8 | ||
| isort | ||
| pre-commit |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| <!doctype html> | ||
| <html lang="ru"> | ||
| <head> | ||
| <meta charset="utf-8"> | ||
| <meta name="viewport" content="width=device-width, initial-scale=1"> | ||
| <title>Ресторан "У моря"</title> | ||
| <style> | ||
| :root { --accent: #ff6b6b; --muted: rgba(255,255,255,0.85) } | ||
| html,body { height:100%; margin:0 } | ||
| body { | ||
| font-family: Arial, Helvetica, sans-serif; | ||
| color: var(--muted); | ||
| background: url('https://images.unsplash.com/photo-1507525428034-b723cf961d3?w=1600&q=80&auto=format&fit=crop') center/cover fixed no-repeat; | ||
| -webkit-font-smoothing:antialiased; | ||
| -moz-osx-font-smoothing:grayscale; | ||
| } | ||
| .overlay { background: rgba(0,0,0,0.38); min-height:100%; } | ||
| header { display:flex; align-items:center; justify-content:space-between; padding:1rem 2rem; } | ||
| header h1{ margin:0; font-size:1.6rem } | ||
| nav a{ color:var(--muted); margin-left:1rem; text-decoration:none; font-weight:600 } | ||
|
|
||
| .hero { display:flex; gap:2rem; align-items:center; padding:6rem 2rem; max-width:1200px; margin:0 auto } | ||
| .hero .intro{ max-width:640px } | ||
| .hero h2{ font-size:2.4rem; margin:0 0 .5rem; text-shadow:0 6px 24px rgba(0,0,0,0.6) } | ||
| .hero p{ margin:.5rem 0 1rem; line-height:1.45 } | ||
| .btn{ display:inline-block; background:var(--accent); color:#fff; padding:.75rem 1.1rem; border-radius:999px; text-decoration:none; font-weight:700 } | ||
|
|
||
| .section{ padding:3rem 2rem; max-width:1100px; margin:0 auto } | ||
| .menu{ display:flex; gap:1rem; flex-wrap:wrap } | ||
| .menu-item{ background:rgba(255,255,255,0.04); padding:1rem; border-radius:10px; flex:1 1 240px; transition:transform .18s ease, box-shadow .18s ease } | ||
| .menu-item:hover{ transform:translateY(-8px) scale(1.02); box-shadow:0 12px 30px rgba(0,0,0,0.5) } | ||
| .menu-item h4{ margin:.2rem 0 } | ||
|
|
||
| footer{ text-align:center; padding:2rem; color:rgba(255,255,255,0.8) } | ||
|
|
||
| /* small screens */ | ||
| @media (max-width:700px){ | ||
| .hero{ padding:3rem 1rem; flex-direction:column; text-align:center } | ||
| nav{ display:none } | ||
| } | ||
|
|
||
| /* Reserve modal */ | ||
| .modal{ position:fixed; left:0; top:0; right:0; bottom:0; display:flex; align-items:center; justify-content:center; background:rgba(0,0,0,0.6); opacity:0; pointer-events:none; transition:opacity .2s } | ||
| .modal.open{ opacity:1; pointer-events:auto } | ||
| .modal .card{ background:#fff; color:#222; padding:1.6rem; border-radius:8px; min-width:280px; max-width:420px } | ||
| .close{ background:#eee;border-radius:6px;padding:.4rem .6rem;cursor:pointer } | ||
| </style> | ||
| </head> | ||
| <body> | ||
| <div class="overlay"> | ||
| <header> | ||
| <h1>Ресторан "У моря"</h1> | ||
| <nav> | ||
| <a href="#menu">Меню</a> | ||
| <a href="#contacts">Контакты</a> | ||
| <a href="#reserve" id="reserveBtn" class="btn">Забронировать</a> | ||
| </nav> | ||
| </header> | ||
|
|
||
| <main> | ||
| <section class="hero"> | ||
| <div class="intro"> | ||
| <h2>Свежая еда, уютная атмосфера и вид на море</h2> | ||
| <p>Насладитесь авторскими блюдами из местных ингредиентов, приготовленными с любовью нашим шеф-поваром.</p> | ||
| <p><strong>Часы работы:</strong> 10:00 — 23:00</p> | ||
| <p>Текущее время: <span id="now"></span></p> | ||
| <a href="#menu" class="btn">Посмотреть меню</a> | ||
| </div> | ||
| <div class="visual" aria-hidden="true"> | ||
| <img src="https://images.unsplash.com/photo-1498654896293-37aacf113fd9?w=800&q=80&auto=format&fit=crop" alt="restaurant" style="width:320px;border-radius:8px;box-shadow:0 10px 30px rgba(0,0,0,0.6)"> | ||
| </div> | ||
| </section> | ||
|
|
||
| <section id="menu" class="section"> | ||
| <h3>Меню</h3> | ||
| <div class="menu"> | ||
| <div class="menu-item"><h4>Салат из лосося</h4><p>Свежий лосось, микс салатов, цитрусовая заправка — 320 ₴</p></div> | ||
| <div class="menu-item"><h4>Паста морская</h4><p>Паста с морепродуктами в сливочном соусе — 380 ₴</p></div> | ||
| <div class="menu-item"><h4>Стейк</h4><p>Говяжий стейк с овощами — 420 ₴</p></div> | ||
| <div class="menu-item"><h4>Десерт дня</h4><p>Творожный чизкейк с ягодным соусом — 150 ₴</p></div> | ||
| </div> | ||
| </section> | ||
|
|
||
| <section id="contacts" class="section"> | ||
| <h3>Контакты</h3> | ||
| <p>Адрес: Одесса, набережная, 1</p> | ||
| <p>Телефон: <a href="tel:+380000000000" style="color:var(--muted);text-decoration:underline">+38 0XX XXX XX XX</a></p> | ||
| </section> | ||
| </main> | ||
|
|
||
| <footer> | ||
| © Ресторан "У моря" — Приятного аппетита! | ||
| </footer> | ||
|
|
||
| <div id="reserveModal" class="modal" role="dialog" aria-hidden="true"> | ||
| <div class="card"> | ||
| <h4>Бронирование</h4> | ||
| <p>Пожалуйста, оставьте номер — мы свяжемся с вами для подтверждения.</p> | ||
| <div style="display:flex;gap:.5rem;margin-top:.6rem"> | ||
| <input id="phone" placeholder="Ваш телефон" style="flex:1;padding:.5rem;border:1px solid #ddd;border-radius:6px"> | ||
| <button id="sendReserve" class="btn">Отправить</button> | ||
| </div> | ||
| <div style="margin-top:.6rem;text-align:right"><button id="closeModal" class="close">Закрыть</button></div> | ||
| </div> | ||
| </div> | ||
| </div> | ||
|
|
||
| <script> | ||
| // Live clock | ||
| document.addEventListener('DOMContentLoaded', function(){ | ||
| var nowEl = document.getElementById('now'); | ||
| function tick(){ nowEl.textContent = new Date().toLocaleTimeString(); } | ||
| tick(); setInterval(tick,1000); | ||
|
|
||
| var reserveBtn = document.getElementById('reserveBtn'); | ||
| var modal = document.getElementById('reserveModal'); | ||
| var close = document.getElementById('closeModal'); | ||
| var send = document.getElementById('sendReserve'); | ||
| reserveBtn && reserveBtn.addEventListener('click', function(e){ e.preventDefault(); modal.classList.add('open'); modal.setAttribute('aria-hidden','false'); }); | ||
| close && close.addEventListener('click', function(){ modal.classList.remove('open'); modal.setAttribute('aria-hidden','true'); }); | ||
| send && send.addEventListener('click', function(){ alert('Спасибо! Мы свяжемся с вами для подтверждения брони.'); modal.classList.remove('open'); modal.setAttribute('aria-hidden','true'); }); | ||
|
|
||
| // Smooth scroll for anchor links | ||
| document.querySelectorAll('a[href^="#"]').forEach(function(a){ a.addEventListener('click', function(e){ var target = document.querySelector(this.getAttribute('href')); if(target){ e.preventDefault(); target.scrollIntoView({behavior:'smooth'}); } }); }); | ||
| }); | ||
| </script> | ||
| </body> | ||
| </html> |
Review comment: Update to actions/setup-python@v5.
The actions/setup-python@v4 action runner is deprecated and too old to run on current GitHub Actions infrastructure.
Proposed fix: change actions/setup-python@v4 to actions/setup-python@v5.
Tools: actionlint (1.7.9), 15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue (action)