Amazon daily deals page not loading completely #21
Comments
So I looked at an old issue where you suggested a timeout script and implemented it, but now, after loading for about 30 seconds, the script throws an error. Here is my updated code (basically I am passing the id of the last product in $selector, to wait for it to load):
`$windowObj = \MTS\Factories::getDevices()->getLocalHost()->getBrowser('phantomjs')->getNewWindow();
$agentName = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57";
$myUrl = "https://www.amazon.com/gp/goldbox/";
$timeout = 30; //in seconds
$tTime = time() + $timeout;
}
if ($pageReady === false) {
}`
And this is the message that came after the script ended:
Fatal error: Uncaught exception 'Exception' with message 'MTS\Common\Devices\Browsers\PhantomJS::getSelectorExists>> Got result code: 0, EMsg: Failed to get selector exists. Error: Invalid Return: null, ECode: 0' in E:\xampp\htdocs\new_job\mts_browser_1\MTS\MTS\Common\Devices\Browsers\PhantomJS.php:227
Stack trace:
#0 E:\xampp\htdocs\new_job\mts_browser_1\MTS\MTS\Common\Devices\Browsers\Window.php(150): MTS\Common\Devices\Browsers\PhantomJS->getSelectorExists(Object(MTS\Common\Devices\Browsers\Window), '[id=100_dealVie...')
#1 E:\xampp\htdocs\new_job\mts_browser_1\mts_daily_deals.php(24): MTS\Common\Devices\Browsers\Window->getSelectorExists('[id=100_dealVie...')
#2 {main} thrown in E:\xampp\htdocs\new_job\mts_browser_1\MTS\MTS\Common\Devices\Browsers\PhantomJS.php on line 227
Since the DOM is extended via AJAX, you will need to trigger the call that extends the page. The easiest way is likely to simply scroll down the page.
To ensure you get the entire page, find an element that only shows up once there is no more dynamic content to load. Then loop over the scroll function and test for the presence of the element you seek.
Furthermore, to screenshot the entire page, you will need to scroll all the way down, then resize the browser to the size of the document and take the screenshot.
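The resize-then-screenshot step could look roughly like the sketch below. `screenshotFullPage` is a hypothetical helper, not part of MTS. It only assumes the window object offers `getDocument()`, `setSize()` and `screenshot()`, which are the calls that appear in the code posted in this thread; the `"height"` key is an assumption made by analogy with `"width"`.

```php
<?php
/**
 * Resize a browser window to the full document size, then take a screenshot.
 * Hypothetical helper for illustration only; not part of the MTS library.
 *
 * $window is assumed to expose getDocument(), setSize($width, $height) and
 * screenshot(), as used elsewhere in this thread.
 */
function screenshotFullPage($window): string
{
    // Read the document dimensions after scrolling has finished.
    $doc = $window->getDocument();
    $width = (int) $doc["document"]["width"];
    // Assumption: the height is reported under the same "document" entry.
    $height = (int) $doc["document"]["height"];

    // Make the viewport as large as the document so nothing is cut off.
    $window->setSize($width, $height);

    return $window->screenshot();
}
```

Call it only once the scroll loop has reached the bottom of the page, otherwise the reported document size may still be growing.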
@merlinthemagic here is my code:
`$milliSecs = 60000;
//$browserObj->setKeepalive(true);
$agentName = $_SERVER['HTTP_USER_AGENT'];
$selector = "[id=navFooter]";
while( !$exists ){
}
//get size of the document after your scroll loop is complete:
$docDetails = $windowObj->getDocument();
$width = $docDetails["document"]["width"];
//perform a screenshot:`
Hi, you have two problems. First, you are setting the scroll position to 500px again and again; you need to increment it. Second, imagine how fast the while loop executes compared to how fast the AJAX content is served. You will need to wait a little after each scroll to make sure the content has loaded before scrolling again.
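Both fixes can be sketched roughly as follows. This is a minimal illustration, not the library's own code: `scrollUntilFound` is a hypothetical helper, and the two callables stand in for whatever scroll call and selector test the window object provides (`getSelectorExists` appears in the stack trace above; the scroll call is not shown in this thread, so it is left abstract). The step size, pause and timeout are arbitrary values.

```php
<?php
/**
 * Scroll down in growing steps until a target element appears or time runs out.
 * Hypothetical helper for illustration only; not part of the MTS library.
 *
 * @param callable $scrollTo function(int $top): void, scrolls to $top pixels
 * @param callable $exists   function(): bool, true once the element is present
 */
function scrollUntilFound(
    callable $scrollTo,
    callable $exists,
    int $step = 500,
    int $pauseMs = 250,
    int $timeoutSec = 60
): bool {
    $deadline = microtime(true) + $timeoutSec;
    $top = 0;

    while (microtime(true) < $deadline) {
        if ($exists()) {
            return true;
        }
        // Fix 1: move further down each time instead of re-using the same offset.
        $top += $step;
        $scrollTo($top);
        // Fix 2: give the AJAX request time to finish before testing again.
        usleep($pauseMs * 1000);
    }

    // One last check in case the element appeared during the final pause.
    return (bool) $exists();
}
```

With the window object from this thread, the second callable would wrap `$windowObj->getSelectorExists($selector)` and the first would wrap the window's scroll call. A `false` return means the footer never appeared before the timeout, which is worth reporting rather than silently taking a partial screenshot.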
So for the link in the above comment, it is not even an AJAX issue: I saved the DOM as an HTML file to check the complete page, and the whole page was saved. But when I take a screenshot, not all of the page is displayed.
`//get the HTML of the current page:
//save the window object so we can pick it up again`
So now I have tried it like this, incrementing $top, but the result is still the same, an incomplete screenshot:
`$selector = "[id=navFooter]";
$top = 500;
while(!$exists){
}`
What is the resolution of the image you receive at the end? Also, please post a var_dump of $docDetails.
So for the above link (https://www.amazon.com/gp/offer-listing/B01BAFWRFO/ref=dp_olp_new_mbc?ie=UTF8&condition=new), I fixed it by giving a fixed height and width, so the code is simply the one in the documentation:
`$top = 0;
//perform a screenshot:
//very large image...`
These are the parameters I got from echoing the previous code:
`$selector = "[id=navFooter]";
$top = 500;
while(!$exists){
}
//get size of the document after your scroll loop is complete:
$docDetails = $windowObj->getDocument();
var_dump($docDetails);
$width = $docDetails["document"]["width"];
echo "
$windowObj->setSize($width, $height);
//perform a screenshot:
//very large image...`
So here is the link for the first page of Amazon's daily deals (https://www.amazon.com/gp/goldbox). I am trying to load it with the PHP script, but it loads only the first eight products; the remaining 24 products are not loaded. After analyzing the daily deals page, I realized the rest of the products are loaded through AJAX.
Is there any way I can make the complete page load? Here is the script I am using:
`ini_set('max_execution_time', 120);
require 'MTS/MTS/EnableMTS.php';
$windowObj = \MTS\Factories::getDevices()->getLocalHost()->getBrowser('phantomjs')->getNewWindow();
$agentName = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57";
$windowObj->setUserAgent($agentName);
$myUrl = "https://www.amazon.com/gp/goldbox/";
$windowObj->setUrl($myUrl);
// tried saving the page to check whether it loaded completely; it seems it did not
//$domData = $windowObj->getDom();
//file_put_contents("daily_deals.html", $domData);
//perform a screenshot:
$screenshotData = $windowObj->screenshot();
//the image only shows a portion of the page; how can I capture the complete page?
echo '<img src="data:image/png;base64,' . base64_encode($screenshotData) . '" />';`