Conversation

@l-bat l-bat commented Jun 18, 2025

Ticket: 161649

@l-bat l-bat marked this pull request as draft June 18, 2025 08:47
@l-bat l-bat force-pushed the lt/eval_milebench branch from db5b244 to 9ac0082 Compare June 24, 2025 08:06
@github-actions github-actions bot added the category: llm_bench (Label for tool/llm_bench folder) and category: Image generation samples (GenAI Image generation samples) labels Jun 24, 2025
@l-bat l-bat marked this pull request as ready for review June 24, 2025 14:35
@l-bat l-bat requested a review from vshampor June 24, 2025 14:50
@l-bat l-bat force-pushed the lt/eval_milebench branch 4 times, most recently from 65df084 to a740b42 Compare June 27, 2025 07:56
@Wovchena Wovchena requested a review from Copilot July 3, 2025 12:38

This comment was marked as outdated.

@l-bat l-bat force-pushed the lt/eval_milebench branch 2 times, most recently from 0cad800 to 2cb9233 Compare July 7, 2025 07:10
@alexsu52 alexsu52 requested a review from Wovchena July 7, 2025 12:01
@l-bat l-bat force-pushed the lt/eval_milebench branch 3 times, most recently from 92f191d to eac1c80 Compare July 15, 2025 14:02
@github-actions github-actions bot added the category: GHA (CI based on Github actions) label Jul 15, 2025
@Wovchena Wovchena requested a review from Copilot July 17, 2025 06:25
Copilot AI left a comment

Pull Request Overview

This PR adds MileBench dataset support and validation for visual-language models (VLMs), integrates new MileBench benchmarks into existing cache eviction tests, and updates fixtures/workflows to handle MileBench data.

  • Introduce utils/milebench.py with MileBenchDataset and Eval for data loading and evaluation.
  • Extend test_kv_cache_eviction.py to run MileBench benchmarks via parametrized tests.
  • Update fixtures (conftest.py) and CI workflows to download/extract MileBench parts and adjust timeouts.
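
The download/extract step for multi-part archives can be sketched roughly as follows. This is a minimal illustration, not the PR's `download_test_content` fixture: the helper names `extract_parts` and `make_demo_archive`, the ZIP format, and the marker-file convention are all assumptions made for the example.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_parts(archives, target_dir):
    """Extract each archive into target_dir, skipping parts already done.

    A hidden marker file per archive records a completed extraction so that
    repeated test sessions do not unpack the same part twice. The marker
    convention is an assumption of this sketch, not taken from the PR.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in map(Path, archives):
        marker = target_dir / f".{archive.name}.done"
        if marker.exists():
            continue  # this part was unpacked by an earlier run
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target_dir)
        marker.touch()
        extracted.append(archive.name)
    return extracted


def make_demo_archive(path, member, text):
    """Create a tiny ZIP archive with one text member (demo helper)."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr(member, text)
    return path
```

In a real suite a helper like this would likely be wrapped in a session-scoped fixture so that each archive part is unpacked once per run.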

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/python_tests/utils/milebench.py | Added MileBenchDataset and Eval classes for dataset handling and scoring. |
| tests/python_tests/test_kv_cache_eviction.py | Integrated new MileBench benchmarks into cache eviction tests. |
| tests/python_tests/conftest.py | Extended TEST_FILES and download_test_content fixture for MileBench archives. |
| tests/python_tests/samples/conftest.py | Removed obsolete download_test_content imports in sample tests. |
| .github/workflows/*.{yml} | Increased timeout for Cacheopt E2E CI jobs to accommodate longer tests. |

Comments suppressed due to low confidence (2)

tests/python_tests/utils/milebench.py:103

  • [nitpick] The class name Eval is very generic and could collide with other utilities; consider renaming it to something more descriptive like MileBenchEvaluator.
class Eval:

tests/python_tests/utils/milebench.py:19

  • [nitpick] New MileBenchDataset and Eval utilities lack dedicated unit tests; consider adding tests that verify string transformation, image preprocessing, and choice matching logic.
class MileBenchDataset:
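
A unit test for the choice matching logic mentioned in this nitpick could start from something like the sketch below. It does not reproduce the `Eval` class from the PR: `extract_choice` and `accuracy` are hypothetical helpers, and the matching rule (first standalone uppercase option letter) is an assumption chosen for illustration.

```python
import re


def extract_choice(response, choices=("A", "B", "C", "D")):
    """Return the first standalone option letter in response, or None.

    Matching is case-sensitive and word-bounded, so "(B)" and "B." match
    while the "a" inside "answer" does not. Known limitation of this sketch:
    a sentence that starts with the article "A" is read as option A.
    """
    pattern = r"\b(" + "|".join(map(re.escape, choices)) + r")\b"
    match = re.search(pattern, response.strip())
    return match.group(1) if match else None


def accuracy(responses, answers):
    """Fraction of responses whose extracted choice equals the reference."""
    if not answers:
        return 0.0
    hits = sum(extract_choice(r) == a for r, a in zip(responses, answers))
    return hits / len(answers)
```

Tests built on helpers like these can pin down edge cases (parenthesized letters, trailing punctuation, responses with no option at all) before the scorer is used in the longer cache eviction runs.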

@l-bat l-bat force-pushed the lt/eval_milebench branch 3 times, most recently from f838dc3 to 0a5a268 Compare July 22, 2025 10:05
Copilot AI review requested due to automatic review settings December 3, 2025 17:33
@github-actions github-actions bot added the category: VLM samples (GenAI VLM samples) label and removed the category: continuous batching (Continuous batching) label Dec 3, 2025
Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.


@l-bat l-bat force-pushed the lt/eval_milebench branch from 631718c to ce1be2f Compare December 3, 2025 17:37
Copilot AI review requested due to automatic review settings December 3, 2025 17:44
@l-bat l-bat force-pushed the lt/eval_milebench branch from ce1be2f to 037dd74 Compare December 3, 2025 17:44
@github-actions github-actions bot removed the category: Image generation samples (GenAI Image generation samples) label Dec 3, 2025

This comment was marked as duplicate.

@l-bat l-bat force-pushed the lt/eval_milebench branch from 037dd74 to cf2a650 Compare December 3, 2025 17:52
@openvinotoolkit openvinotoolkit deleted a comment from Copilot AI Dec 3, 2025
@openvinotoolkit openvinotoolkit deleted a comment from Copilot AI Dec 3, 2025
Copilot AI review requested due to automatic review settings December 3, 2025 17:57
@l-bat l-bat force-pushed the lt/eval_milebench branch from cf2a650 to 7283a08 Compare December 3, 2025 17:57
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


@github-actions github-actions bot removed the category: llm_bench (Label for tool/llm_bench folder) and category: GGUF (GGUF file reader) labels Dec 3, 2025
@openvinotoolkit openvinotoolkit deleted a comment from Copilot AI Dec 3, 2025
@l-bat l-bat requested a review from Copilot December 3, 2025 18:18
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


@l-bat l-bat force-pushed the lt/eval_milebench branch from 5ed0a30 to 597bfe2 Compare December 3, 2025 20:41
@l-bat l-bat added this pull request to the merge queue Dec 4, 2025
Merged via the queue into openvinotoolkit:master with commit b160eee Dec 4, 2025
96 checks passed

Labels

category: VLM samples (GenAI VLM samples)

5 participants