Question regarding correct JSON body for actual results: Can't seem to get any meaningful and reliable results! #1421
Let's make your config easier for the next person to read:
From looking at the site, extracting the mine data directly with an LLM will be difficult. A more reliable approach would be:
On the operations/projects pages, the data is basically ready for crawling. You can also inject JavaScript in your browser to test:
Replace the In this case, an LLM is probably overkill. Still, if you want to push it,
Important for this particular page:
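To illustrate why an LLM is probably overkill when the page markup is already structured, here is a minimal sketch of deterministic extraction using only Python's standard library. The HTML snippet and the class names (`operation`, `operation-name`, `operation-location`) are hypothetical placeholders, not the real markup of the site; inspect the actual operations/projects pages and adjust them. The output shape follows the `mines` schema from the question.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real page structure and class names will differ.
SAMPLE_HTML = """
<div class="operation">
  <h3 class="operation-name">Example Mine A</h3>
  <span class="operation-location">Example Region, Country X</span>
</div>
<div class="operation">
  <h3 class="operation-name">Example Mine B</h3>
  <span class="operation-location">Another Region, Country Y</span>
</div>
"""


class MineParser(HTMLParser):
    """Collects name/location pairs from elements with known class names."""

    # Maps an (assumed) CSS class to the output field it fills.
    FIELDS = {"operation-name": "name", "operation-location": "location"}

    def __init__(self):
        super().__init__()
        self.mines = []
        self._field = None  # field currently being filled, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "operation" in classes:
            # A new container element starts a new record.
            self.mines.append({"name": "", "location": ""})
        for cls in classes:
            if cls in self.FIELDS and self.mines:
                self._field = self.FIELDS[cls]

    def handle_data(self, data):
        if self._field:
            self.mines[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        # Stop capturing text once any element closes.
        self._field = None


def extract_mines(html):
    """Return {"mines": [{"name": ..., "location": ...}, ...]} without any LLM call."""
    parser = MineParser()
    parser.feed(html)
    parser.close()
    return {"mines": parser.mines}


if __name__ == "__main__":
    print(extract_mines(SAMPLE_HTML))
```

Because nothing is sent to a model, this costs no input tokens and returns the same result on every run. The trade-off is that the class names have to be updated whenever the page markup changes.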
The only thing I can't seem to get working is a proper scrape of the website and its information. Do you know of any videos or guidance on how to properly get information from a site?
I tried a deep-crawl strategy, but it's just not getting it, and it takes millions of input tokens for basically no output whatsoever. The AI model is then prompted with basically the entire website and is supposed to extract information from that, which obviously does not work to the precision that we would like to achieve.
{
  "urls": ["https://www.kinross.com/"],
  "browser_config": {
    "type": "BrowserConfig",
    "params": { "headless": true }
  },
  "crawler_config": {
    "type": "CrawlerRunConfig",
    "params": {
      "deep_crawl_strategy": {
        "type": "BFSDeepCrawlStrategy",
        "params": {
          "max_depth": 2,
          "max_pages": 20,
          "include_external": false
        }
      },
      "extraction_strategy": {
        "type": "LLMExtractionStrategy",
        "params": {
          "llm_config": {
            "type": "LLMConfig",
            "params": {
              "provider": "gemini/gemini-2.5-flash",
              "api_token": "env:GOOGLE_API_KEY"
            }
          },
          "instruction": "Extract mine names and locations. Return JSON only.",
          "schema": {
            "type": "dict",
            "value": {
              "type": "object",
              "properties": {
                "mines": {
                  "type": "array",
                  "items": {
                    "type": "object",
                    "properties": {
                      "name": { "type": "string" },
                      "location": { "type": "string" }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}