Conversation

YoungVor (Contributor) commented Oct 12, 2025

TL;DR

Added a new PDF processing demo notebook and improved PDF parsing capabilities across multiple LLM providers.

What changed?

  • Added a new notebook 18_pdf_processing.ipynb that demonstrates PDF to markdown conversion, content categorization, and structured extraction
  • Enhanced the PDF parsing system prompt to better preserve document structure and formatting

How to test?

  1. Run the new notebook examples/fenic_in_120_seconds/18_pdf_processing.ipynb
  2. Verify that PDFs are properly converted to markdown with preserved structure
  3. Check that content categorization correctly identifies document sections, products mentioned, and training methods
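
For step 2, a rough structural check can help before reading the output by eye. The sketch below is not part of the notebook or of fenic; `summarize_markdown_structure` is a hypothetical helper that only counts headings, list items, and table rows in a markdown string, on the assumption that a faithful conversion of a structured PDF should yield non-zero counts for the elements the source contains.

```python
import re


def summarize_markdown_structure(markdown: str) -> dict:
    """Count basic structural elements in a markdown string.

    Hypothetical helper for manual verification; it does not call fenic
    and makes no claim about how the parser itself works.
    """
    lines = markdown.splitlines()
    return {
        # ATX headings: one to six '#' characters followed by text
        "headings": sum(1 for ln in lines if re.match(r"^#{1,6}\s+\S", ln)),
        # Bulleted or numbered list items
        "list_items": sum(
            1 for ln in lines if re.match(r"^\s*(?:[-*+]|\d+\.)\s+\S", ln)
        ),
        # Pipe-delimited table rows, including the separator row
        "table_rows": sum(1 for ln in lines if re.match(r"^\s*\|.*\|\s*$", ln)),
    }


# Made-up sample standing in for the markdown produced from one PDF
sample = (
    "# Title\n\nIntro text.\n\n## Section\n\n"
    "- item one\n- item two\n\n"
    "| a | b |\n|---|---|\n| 1 | 2 |\n"
)
summary = summarize_markdown_structure(sample)
print(summary)
```

A count of zero headings or table rows for a PDF that visibly has them is a quick signal that structure was lost during conversion.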

YoungVor (Contributor, Author) commented Oct 12, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch 2 times, most recently from 8e53f1d to eef48d2 on October 13, 2025 16:20
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 3ed6592 to a9128d0 on October 13, 2025 16:30
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from eef48d2 to 50bd971 on October 13, 2025 16:30
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from a9128d0 to 9979dd2 on October 13, 2025 16:38
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 50bd971 to c5ef213 on October 13, 2025 16:38
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 9979dd2 to 243194d on October 13, 2025 16:39
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch 3 times, most recently from c407759 to ce5e0d5 on October 13, 2025 16:51
YoungVor marked this pull request as ready for review on October 13, 2025 16:53
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from ce5e0d5 to d5eba4c on October 13, 2025 17:24
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from d5eba4c to 4667fdf on October 14, 2025 21:34
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 243194d to a9891a0 on October 15, 2025 16:25
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch 2 times, most recently from 9be1491 to cd95675 on October 15, 2025 21:12
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from a9891a0 to fac3ee0 on October 15, 2025 21:15
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from cd95675 to 9c58370 on October 15, 2025 21:15
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from fac3ee0 to 5b33a7d on October 21, 2025 18:30
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 9c58370 to 772109a on October 21, 2025 18:30
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 5b33a7d to b7be090 on October 22, 2025 20:34
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 772109a to 547fb7e on October 22, 2025 20:34
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from b7be090 to 2ec97bf on October 23, 2025 19:36
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 547fb7e to 53a799d on October 23, 2025 19:37
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 2ec97bf to 651aaf1 on October 23, 2025 19:37
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 53a799d to f01dc4f on October 23, 2025 19:37
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 651aaf1 to b4d4f04 on October 23, 2025 19:41
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from f01dc4f to 260f2bc on October 23, 2025 19:41
YoungVor added the publish label (Publish assets) on Oct 24, 2025
YoungVor closed this on Oct 24, 2025
YoungVor reopened this on Oct 24, 2025
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from b4d4f04 to 7e644ae on October 24, 2025 22:44
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 260f2bc to 9a281cd on October 24, 2025 22:45
YoungVor force-pushed the 10-08-feat_add_pdf_parsing_to_openrouter branch from 7e644ae to 6c4fcee on October 24, 2025 23:25
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 9a281cd to 289ea48 on October 24, 2025 23:25
Base automatically changed from 10-08-feat_add_pdf_parsing_to_openrouter to main on October 24, 2025 23:32
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 289ea48 to eeb1d1d on October 24, 2025 23:48
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from eeb1d1d to d160b20 on November 20, 2025 00:20
YoungVor changed the base branch from main to 10-30-make_timeouts_configurable_in_semantic_llm_operations on November 20, 2025 00:20
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from d160b20 to 800af67 on November 20, 2025 00:41
YoungVor changed the base branch from 10-30-make_timeouts_configurable_in_semantic_llm_operations to main on November 20, 2025 00:41
YoungVor force-pushed the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch from 800af67 to 0f74568 on November 20, 2025 00:41
YoungVor merged commit 7c7454a into main on Nov 20, 2025
14 checks passed
YoungVor deleted the 10-12-feat_tweak_pdf_parser_for_corner_cases_and_add_120s_demo branch on November 20, 2025 01:13
Labels

publish Publish assets

4 participants