Not able to trigger lazy loading in PhantomJS

I'm trying to scrape some information with PhantomJS from a page that uses lazy loading (http://www.myntra.com/women-sarees?nav_id=606). Below is my code snippet:

window.setInterval(function() {
    // Reads the display style of the first element with
    // class "more-products-loading-indicator"
    // (not sure if this is the best way of doing it)
    var count = page.evaluate(function() {
        try {
            return document.getElementsByClassName('more-products-loading-indicator')[0].style.display;
        } catch (e) {
            return e.message;
        }
    });

    if ((count == 'none') && (k < 4)) { // Didn't find
        console.log('count none' + k);
        k = k + 1;
        page.evaluate(function() {
            // Scrolls to the bottom of page
            console.log('hey');
            //window.scrollBy(0,500);
            window.document.body.scrollTop = document.body.scrollHeight;
        });
        page.render('myn' + k + '.png');
    }
    else { // Found
        //Do what you want
        //console.log('len123');
        console.log('count block');
        page.evaluate(function() {
            // Scrolls to the bottom of page
        });
        try {
            var links = page.evaluate(function() {
                return [].map.call(document.querySelectorAll('a.clearfix'), function(link) {
                    return 'http://www.myntra.com' + link.getAttribute('href');
                });
            });
        } catch (e) {
            console.log(e.message);
            return [];
        }

        console.log(links.join(','));
        var result = links.join(',');
        console.log(links.length);
        page.render('myntra.png');
        phantom.exit();
    }
}, 5000); // Number of ms to wait between scrolls

But only the first six rows get scraped. Apparently, no further products are loaded after the page is scrolled down.

test123

Posted Nov 9 2013 0:54 UTC

$10


  • casperjs
  • phantomjs
  • 3541 Views

3 Replies


Actually, you don't seem to be scrolling anywhere.

page.evaluate(function() { // Scrolls to the bottom of page

    });

is an empty function. Why should it scroll?

Here are some pointers on how to scroll (http://stackoverflow.com/questions/11715646/scroll-automatically-to-the-bottom-of-the-page). Basically, it's

window.scrollTo(0,document.body.scrollHeight);

You should experiment to find what works for your scenario, since on your link this command doesn't scroll to the real end of the page, only to the next load point. So it should run in a loop until the real end of the page is reached, e.g. when the number of li elements in the product list equals the number in the "XXX products found" heading.
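The stopping condition could be sketched roughly like this. It is only a sketch: the helper names parseTotal and allProductsLoaded are made up for illustration, the selectors in the trailing comment are hypothetical, and I'm assuming the heading text contains the total as its first number.

```javascript
// Extracts the first integer from heading text such as "1,234 products found".
// Returns null when the text contains no number.
function parseTotal(headingText) {
    var match = /(\d[\d,]*)/.exec(headingText || '');
    return match ? parseInt(match[1].replace(/,/g, ''), 10) : null;
}

// True once at least as many products are in the DOM as the heading advertises.
function allProductsLoaded(loadedCount, headingText) {
    var total = parseTotal(headingText);
    return total !== null && loadedCount >= total;
}

// Inside PhantomJS the two inputs would come from page.evaluate, for example:
//   var state = page.evaluate(function () {
//       return {
//           loaded: document.querySelectorAll('ul.results li').length, // hypothetical selector
//           heading: document.querySelector('h1').textContent          // hypothetical selector
//       };
//   });
//   if (!allProductsLoaded(state.loaded, state.heading)) { /* scroll again */ }
```

Keeping the comparison outside page.evaluate means it can be checked without loading the page at all.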

Good luck.

alonisser

Posted Nov 9 2013 12:36 UTC

window.document.body.scrollTop = document.body.scrollHeight; Won't this do it? It's in the if branch, and the initial value of count is 'none', so I guess it should scroll at least once.

test123

Posted Nov 10 2013 7:11 UTC

Note that you don't initialize k. In some languages that might work, but not in JS. Open your console and type k = k+1 (or k++) and you'll get a ReferenceError, so I'm not sure the loop is even running. What probably happens is that it goes straight to the else clause (since you combine the conditions with "and", &&, rather than "or", ||), where there is no scrolling down (the function is empty, as I explained in my previous answer). Then some rows are scraped and that's it: k is never updated, the same rows are scraped again, and so on.
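To make the k point concrete, here is a small sketch. The names readUndeclared, undeclaredCounter and nextAttempt are invented for the example, and I'm assuming the counter is meant to live outside the interval callback.

```javascript
// Reading a variable that was never declared throws a ReferenceError.
function readUndeclared() {
    try {
        return undeclaredCounter + 1; // hypothetical name, never declared anywhere
    } catch (e) {
        return e.name;
    }
}

// Declaring and initialising the counter once, outside the callback,
// lets every later tick increment it safely.
var k = 0;
function nextAttempt() {
    k = k + 1;
    return k;
}
```

Calling readUndeclared() gives back the error name instead of a number, while nextAttempt() simply counts up from the initial 0.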

You should also note that a simple while loop (or a recursive one, where the if clause calls the enclosing function again while there is still scrolling to do) would probably be much more helpful to you than this setInterval, which is probably an async nightmare to handle and a very inefficient solution (it could also time out before the scraping is done).
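A rough sketch of the recursive variant, under some assumptions: scrollUntilDone, isDone, onFinished and the options object are all names I made up, page only needs an evaluate() method like a PhantomJS webpage object, and the default delay and round limit are arbitrary. I use a timer between steps rather than a blocking loop, since a blocking loop would likely not give the page a chance to fetch the next batch.

```javascript
// Scrolls, waits, and re-checks until isDone(page) is true or maxRounds is hit.
// onFinished receives the number of scrolls that were performed.
function scrollUntilDone(page, isDone, onFinished, options) {
    var opts = options || {};
    var delayMs = opts.delayMs === undefined ? 2000 : opts.delayMs;
    var maxRounds = opts.maxRounds === undefined ? 20 : opts.maxRounds;
    var schedule = opts.schedule || setTimeout; // replaceable, e.g. for testing
    var round = 0;

    function step() {
        if (isDone(page) || round >= maxRounds) {
            onFinished(round);
            return;
        }
        round += 1;
        page.evaluate(function () {
            // Runs inside the page: jump to the current bottom
            window.scrollTo(0, document.body.scrollHeight);
        });
        schedule(step, delayMs); // check again after the page had time to load
    }

    step();
}
```

In the original script, the link extraction, page.render and phantom.exit() would move into onFinished, so they run exactly once after the last scroll.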

And I really think you should move to CasperJS, a tool built on top of PhantomJS that is much easier to program with, at least for your kind of scraping work.

alonisser

Posted Nov 10 2013 13:28 UTC
