How to verify source content? (to eliminate the scenario of being throttled) #1010

@cthulhubuddha

I am successfully pulling the content of a source page with SmartScraperGraph and sending prompts to a local LLM. However, the LLM's answers suggest the content is being throttled during the scrape. For example, the source page shows 200 listings in a browser, but across several different prompts through SmartScraperGraph the LLM only ever sees 10.

How can I dump the entire contents of what was scraped to a file so I can compare to the page's source code?
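
One possible way to do this comparison is sketched below. It assumes the scraped content can be obtained from the graph as a dictionary of intermediate results after `run()`; the `get_state()` accessor shown in the trailing comment is an assumption that should be checked against the installed ScrapeGraphAI version. The helpers `dump_state` and `count_marker`, the output directory name, and the marker string are hypothetical names introduced only for illustration.

```python
import json
from pathlib import Path
from typing import Any, List, Mapping


def dump_state(state: Mapping[str, Any], out_dir: str) -> List[Path]:
    """Write each entry of a state mapping to its own UTF-8 text file."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for key, value in state.items():
        # Keep file names safe: replace anything that is not alphanumeric, '-' or '_'.
        safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in str(key))
        # Strings are written verbatim; other values are serialized for inspection.
        if isinstance(value, str):
            text = value
        else:
            text = json.dumps(value, default=str, indent=2)
        path = target / f"{safe}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


def count_marker(text: str, marker: str) -> int:
    """Count non-overlapping occurrences of a marker, e.g. a listing's CSS class."""
    return text.count(marker)


# Hypothetical usage, assuming `graph` is a SmartScraperGraph that has been run
# and that it exposes its final state (assumption: verify in your version):
#
#   graph.run()
#   state = graph.get_state()
#   for path in dump_state(state, "scrape_dump"):
#       print(path, count_marker(path.read_text(encoding="utf-8"), "listing"))
```

Comparing the marker count in the dumped files with the count in the browser's "view source" output should show whether the 200 listings are already missing from the fetched document or are lost later, before the content reaches the LLM.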
