paulm Posted August 7, 2016 (edited) Viewing the page source and downloading the URL with wget both show this Google Trends page without the "Stories Trending Now" list, which is visible in the Developer Console as md-list.md-list-block.ng-scope. Searching on this topic points to JavaScript hiding elements of the DOM. How can I download this section of the page? Thanks in advance for any guidance.
paulm Posted August 7, 2016 (edited) I just happened on PhantomJS, which seems like the right direction for getting HTML obfuscated by JavaScript. My PhantomJS script works with w3schools.invisionzone.com as the URL, but not with the Google Trends URL. Here is the output of the troubleshooting script; thanks for any help.

  "headers": [
    { "name": "User-Agent", "value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36" },
    { "name": "Accept", "value": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" }
  ],
  "id": 1,
  "method": "GET",
  "time": "2016-08-07T14:02:34.247Z",
  "url": "http://www.google.com/trends/home/m/US"
}
Request {
  "headers": [
    { "name": "User-Agent", "value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36" },
    { "name": "Accept", "value": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" }
  ],
  "id": 2,
  "method": "GET",
  "time": "2016-08-07T14:02:34.473Z",
  "url": "https://www.google.com/trends/home/m/US"
}
Unable to access network
Ingolme Posted August 7, 2016 Any content generated by JavaScript will not be in the page's source code, but will be in the DOM. The source code shows exactly the data that was received from the server.
paulm Posted August 7, 2016 Right.
paulm Posted August 7, 2016 In case it helps someone, I got PhantomJS working to retrieve HTML from HTTPS URLs by running:

  path-to/phantom.js --ssl-protocol=any /path-to/script.js
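For anyone following along, here is a minimal sketch of what such a script might look like. It is not the script used above, which was not posted. It assumes the URL and the md-list.md-list-block selector mentioned earlier in the thread; the helper name hasRenderedList, the 250 ms polling interval, and the 10 second timeout are hypothetical choices for illustration. The idea is to load the page, wait until the JavaScript-generated list shows up in the DOM, and then print that element's HTML.

```javascript
// Sketch for PhantomJS, started with the --ssl-protocol=any option noted above.
// URL and selector are taken from the thread; the timing values are assumptions.
var TARGET_URL = 'https://www.google.com/trends/home/m/US';
var SELECTOR = 'md-list.md-list-block';
var POLL_MS = 250;       // assumed polling interval
var TIMEOUT_MS = 10000;  // assumed upper bound on waiting

// Hypothetical helper: true once the serialized DOM mentions the list class.
function hasRenderedList(html) {
  return typeof html === 'string' && html.indexOf('md-list-block') !== -1;
}

// Run the PhantomJS part only when the phantom global exists, so the helper
// above can also be loaded and checked outside PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();

  page.open(TARGET_URL, function (status) {
    if (status !== 'success') {
      console.log('Unable to access network');
      phantom.exit(1);
      return;
    }

    var waited = 0;
    var timer = setInterval(function () {
      waited += POLL_MS;
      // page.content is the current DOM serialized as HTML, not the raw source.
      if (hasRenderedList(page.content) || waited >= TIMEOUT_MS) {
        clearInterval(timer);
        // The function runs inside the page; SELECTOR is passed in as an argument.
        var html = page.evaluate(function (sel) {
          var node = document.querySelector(sel);
          return node ? node.outerHTML : null;
        }, SELECTOR);
        console.log(html === null ? 'List not found in DOM' : html);
        phantom.exit(html === null ? 1 : 0);
      }
    }, POLL_MS);
  });
}
```

Polling is used here because the load callback can fire before the page's own scripts have finished building the list; a fixed delay would also work but is harder to tune.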
justsomeguy Posted August 8, 2016 Just to be clear, the HTML is not being hidden or obfuscated; it is being created by JavaScript.