Pass Markdown directly instead of HTML #621
Unanswered
DidacGit asked this question in Forums - Q&A
Replies: 1 comment
Hello,
Is there a way to pass Markdown directly to the request instead of HTML?
What I want to do is parse a PDF -> MD -> LLM Request.
Here's how I tried to do it (following Crawl4AI Docs - Local Files):
But the result is kinda weird:
- `result.cleaned_html`: I see it added some markup elements (e.g. `<p>`)
- `result.html`: It is the exact same as `md_text`
- `result.markdown` and `result.markdown_v2.raw_markdown`: It is like `md_text` but in a single line, for some reason.

In any case, the `result.extracted_content` seems to be correct! But I'm still asking just in case I'm doing something wrong. Maybe I should set `LLMExtractionStrategy(input_format="")` to `html` instead of the default `markdown`.

Thanks a lot in advance!
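One possible explanation for the single-line `result.markdown`: in HTML, runs of whitespace in normal text flow, newlines included, count as a single space. If the Markdown string is treated as HTML, an HTML-to-Markdown step would plausibly see it as one text node and flatten it. The sketch below only imitates that whitespace rule with the standard library. It is not Crawl4AI's actual code; `collapse_html_whitespace` is a hypothetical helper and the `md_text` sample is made up for illustration.

```python
import re


def collapse_html_whitespace(text: str) -> str:
    """Collapse each run of whitespace (spaces, tabs, newlines) into one space.

    This mimics how HTML treats whitespace in normal text flow.
    Hypothetical helper for illustration, not part of Crawl4AI.
    """
    return re.sub(r"\s+", " ", text).strip()


# Made-up Markdown sample standing in for the PDF-derived md_text.
md_text = "# Title\n\nFirst paragraph.\n\n- item one\n- item two\n"

flattened = collapse_html_whitespace(md_text)
print(flattened)
# -> # Title First paragraph. - item one - item two
```

If this is what is happening, all of the text still reaches the LLM, only without line breaks, which would fit `result.extracted_content` looking correct.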