Issue rendering images #511
Replies: 1 comment
-
My General Response to the Question

To figure out the JavaScript needed for specific scenarios, it comes down to experience, trial, and understanding web development. Familiarity with standard design patterns, especially for common use cases like e-commerce, helps anticipate how websites are structured. Advanced web development techniques, such as dynamic and lazy loading, are also crucial because they influence how data is loaded. Knowing these allows me to craft JavaScript to ensure all data is fully loaded in the browser.

My approach is to first browse the website in a normal browser, use developer tools, and experiment in the console. This provides a playground to test JavaScript and confirm it loads the necessary data. Once everything works, I move the tested JavaScript into my crawler library for use, avoiding trial and error in Python. Chrome developer tools make this process much faster and more efficient.

Our Library Plan to Use LLMs and Community

For Crawl4AI, we plan to fine-tune small language models next year to assist with data extraction challenges. These models will analyze the data you need and the problems you're facing, then generate the JavaScript code required to fix them. They’ll work with already crawled HTML and, in some cases, page images. Instead of extracting all the data (which can be slow and token-heavy), the models will provide targeted JS snippets to solve your specific issues.

We also plan to build a community-driven hub where users can contribute JavaScript snippets for popular websites. This shared library will contain pre-built code snippets to help users with common crawling scenarios. You’ll be able to load these directly into your libraries, making crawling more efficient and collaborative.

This combination of LLMs and community contributions is a key part of our roadmap for next year, and I believe it will significantly improve how we handle crawling challenges. I hope this helps clarify our vision!
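As a rough illustration of the console-first workflow described above, here is a minimal sketch. Everything named in it is an assumption made for illustration, not code from this thread or from Crawl4AI: `SCROLL_JS` is a generic scroll-to-bottom routine of the kind you might first try in the browser console to trigger lazy loading, and `extract_image_urls` is a hypothetical standard-library helper for counting the image URLs present in the HTML you get back, so you can check whether a snippet actually made more images appear.

```python
from html.parser import HTMLParser

# Hypothetical JavaScript to try in the browser console first. It scrolls to
# the bottom repeatedly so lazy-loaded images get a chance to render.
# The 20-round cap and 500 ms pause are arbitrary assumptions.
SCROLL_JS = """
(async () => {
    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
    let previousHeight = 0;
    for (let i = 0; i < 20; i++) {
        window.scrollTo(0, document.body.scrollHeight);
        await sleep(500);
        const height = document.body.scrollHeight;
        if (height === previousHeight) break;  // page stopped growing
        previousHeight = height;
    }
})();
"""


class ImageCollector(HTMLParser):
    """Collect image URLs, including common lazy-loading attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def _add(self, url):
        # Keep first-seen order and skip duplicates.
        if url and url not in self.urls:
            self.urls.append(url)

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "source"):
            return
        attributes = dict(attrs)
        for name in ("src", "data-src"):
            self._add(attributes.get(name))
        for name in ("srcset", "data-srcset"):
            value = attributes.get(name) or ""
            # Naive srcset parsing: comma-separated "url descriptor" pairs.
            # URLs that themselves contain commas would be split incorrectly.
            for candidate in value.split(","):
                parts = candidate.split()
                if parts:
                    self._add(parts[0])


def extract_image_urls(html):
    """Return the unique image URLs found in an HTML string."""
    parser = ImageCollector()
    parser.feed(html)
    return parser.urls
```

Once a snippet like `SCROLL_JS` behaves as expected in the console, it would be handed to the crawler through the js_code parameter mentioned in the question. Comparing `len(extract_image_urls(html))` for the HTML crawled with and without the snippet is one simple way to see whether it helped.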
-
Hey @unclecode, when using crawl4ai to scrape a few sites (e.g. Lululemon), I'm not able to extract all the images from the product site. I noticed that these images are dynamically rendered. I tried using the js_code parameter to render all the images related to the product, but not all of them are being rendered.
Can you please explain how you go about figuring out the js_code needed to render images?
Following is the code I'm currently using
When checking the terminal output, not all of the images are present. Could you please help me with this issue?
Thanks in advance!