What can I do when I get an operation timeout? #679

@karimWorldSpace

Hi community,

I'm working on a personal project where I need to retrieve the title and HTML content of a webpage (a simple task).

Sometimes the page I visit shows a cookie consent prompt, but the HTML content is already fully loaded, so I don't actually care about the prompt. All the information I need is in the HTML.

Here’s my problem:

  1. When I try to evaluate the title of the page, I often get a timeout error.
  2. To handle this, I retry after the first failure, but the retry times out as well, and I don't see what else I can do.
  3. What’s confusing is that if I manually check the title in the browser’s console, it works perfectly. However, when I try to retrieve the title programmatically in PHP, it doesn’t work.
  4. Does anyone know why this might happen or how I can fix it?

Thanks for your help!

Here is my code:

```php
$urls = $urlScrapedByKeyWordRepository->findBy(['isUsedForGeneration' => false]);
shuffle($urls);
$urls = array_slice($urls, 0, 2);

if ($urls) {
    /** @var UrlScrapedByKeyword[] $urls */
    foreach ($urls as $key => $url) {
        $urlScrapped = ltrim($url->getUrl(), './');

        $browser = $this->createBrowser();
        $page = $browser->createPage();
        $html = false;

        try {
            $page->navigate($urlScrapped, ['strict'])->waitForNavigation(Page::INTERACTIVE_TIME, 6000);
            $page->evaluate("console.log('document.title')");

            // This is where my code crashes, so I catch the error below
            $pageTitle = $page->evaluate('document.title')->waitForResponse()->getReturnValue();

            if ($pageTitle == 'Before you continue') {
                $this->AcceptGoogleCookies($page);
                $pageTitle = $page->evaluate('document.title')->waitForResponse()->getReturnValue();
            }

            echo($pageTitle.' from normal way');
            $pageContent = $page->getHtml(2500);
            sleep(1);
            if ($pageContent) echo('content OK');

            if ($pageTitle == 'Before you continue') $this->AcceptGoogleCookies($page);
        } catch (OperationTimedOut $e) {
            // In the browser console, I can see that this call works correctly
            $page->evaluate("console.log(document.title)");

            // Catch the error and retry evaluating the title, but it crashes again
            $pageTitle = $page->evaluate('document.title')->getReturnValue();

            if ($pageTitle == 'Before you continue') $this->AcceptGoogleCookies($page);

            echo $pageTitle.' from error';
            $pageContent = $page->getHtml(2500);
            sleep(1);
            if ($pageContent) echo('content OK from error');
        } catch (NavigationExpired $e) {
            echo "NavigationExpired error while evaluating the title: $pageTitle<br>";
        }
    }
}
```
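
The single retry in the `catch` block could be generalized into a bounded retry loop. The following is only a minimal sketch in plain PHP: `retryOnException` is a hypothetical helper, not part of chrome-php, and the flaky operation here is simulated with a `RuntimeException`. With the code from this issue, the callable would presumably wrap the `document.title` evaluation and `$retryOn` would be `OperationTimedOut::class`. Retrying does not explain why the evaluation times out; it only bounds how often it is attempted.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of chrome-php): run $operation up to
 * $maxAttempts times, retrying only when it throws an instance of $retryOn.
 */
function retryOnException(callable $operation, string $retryOn, int $maxAttempts = 3, int $delayMs = 500)
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('maxAttempts must be at least 1');
    }

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation($attempt);
        } catch (Throwable $e) {
            // Rethrow anything we were not asked to retry,
            // and give up once the last attempt has failed.
            if (!($e instanceof $retryOn) || $attempt >= $maxAttempts) {
                throw $e;
            }
            usleep($delayMs * 1000); // short pause before the next attempt
        }
    }
}

// Stand-in for a flaky call that fails twice and then succeeds.
$calls = 0;
$title = retryOnException(function () use (&$calls) {
    $calls++;
    if ($calls < 3) {
        throw new RuntimeException('simulated timeout');
    }
    return 'Example title';
}, RuntimeException::class, 5, 10);

echo $title, PHP_EOL;
```

Restricting the retry to one exception class keeps unrelated failures, such as `NavigationExpired` in the code from this issue, from being silently retried.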
