confused and dazed Posted April 22, 2015 (Author)
"Submit a POST request that contains the data from the login form with the correct names, and get the cookies that the server sends back."
I guess I have some investigating ahead of me. I understand what that means on the surface but have no idea how to accomplish it. I will be back at some point, either to say thanks or to ask more questions. Until then... may all your code be free of syntax errors!
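A minimal sketch of the quoted advice, assuming PHP's cURL extension is available. The function names `login` and `extractCookies`, the URL, and the form field names are all hypothetical placeholders; the real field names have to be read from the login form's HTML.

```php
<?php
// Hypothetical helper: pull cookie name/value pairs out of raw response headers.
function extractCookies(string $rawHeaders): array
{
    $cookies = [];
    // Each Set-Cookie header starts with "name=value", optionally followed by attributes.
    $pattern = '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $rawHeaders, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $cookies[$m[1]] = $m[2];
        }
    }
    return $cookies;
}

// Hypothetical login call: POST the form fields and return the cookies the server sets.
function login(string $url, array $fields): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true, // return the response instead of printing it
        CURLOPT_HEADER         => true, // include response headers in the returned string
    ]);
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    return extractCookies(substr($response, 0, $headerSize));
}

// Offline demonstration of the header parsing with a made-up response.
$sample = "HTTP/1.1 200 OK\r\nSet-Cookie: session=abc123; Path=/; HttpOnly\r\nSet-Cookie: lang=en\r\n\r\n";
print_r(extractCookies($sample));

// A real call would look something like this (placeholder URL and field names):
// $cookies = login('https://example.com/login', ['username' => 'me', 'password' => 'secret']);
```

The returned cookies would then be sent back on later requests, for example with `CURLOPT_COOKIE`, so the server treats them as part of the same logged-in session.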
confused and dazed Posted April 23, 2015 (Author)
So this absolutely worked. I was able to log in and grab the info I needed. What I like about the W3S forum (and the reason it is the only one I go to) is that the answers are not simply handed out. Most if not all of you try to teach and let folks do the work on their own. THANKS AGAIN!!
confused and dazed Posted April 23, 2015 (Author)
OK, so once I started digging into the data the user page provided, I realized the content I want is buried in bootstrap table data. How do I extract "bingo" from this table data?
{"status":"X","some_id":"112","getting_close":"ER","I_want_this":"bingo"}
justsomeguy Posted April 23, 2015
Once you convert that to an object, the property name is I_want_this.
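As a small illustration of that conversion, here is a sketch using the sample string from the question. The variable names are made up for the example.

```php
<?php
// The sample JSON string from the question above.
$json = '{"status":"X","some_id":"112","getting_close":"ER","I_want_this":"bingo"}';

// Decoded as an object: read the value as a property.
$asObject = json_decode($json);
echo $asObject->I_want_this, "\n";

// Decoded as an associative array (second argument true): read it by key.
$asArray = json_decode($json, true);
echo $asArray['I_want_this'], "\n";
```

Both lines print bingo. Which form to use is a matter of preference; the array form tends to be easier when key names contain spaces or other characters that are awkward as property names.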
confused and dazed Posted April 23, 2015 (Author) (edited)
I tried that but it's not working... the table is buried in a script tag:
    $dom = new DOMDocument();
    @$dom->loadHTML($stuff);
    foreach ($dom->getElementsByTagName('script') as $this1) {
        $try1 = $this1->*****dont know what to put here******("I_want_this");
        echo $try1;
    }
Edited April 24, 2015 by confused and dazed
confused and dazed Posted April 24, 2015 (Author)
Tried many things. I'm stuck.
Ingolme Posted April 24, 2015
You can't use DOMDocument to extract data from JSON. You need to use the json_decode() function for that.
First, extract the content from the script element and store it as a string, then pass that string to json_decode() to get an object or associative array. After that, you can obtain the value you need from the "I_want_this" property of the object.
confused and dazed Posted April 24, 2015 (Author) (edited)
That's part of the problem: I am not able to extract the data. What call should I use to extract the content from the script element and save it as a string?
Edited April 24, 2015 by confused and dazed
Ingolme Posted April 24, 2015
I'm making assumptions as to how the HTML is structured, but it should be something like this:
    foreach ($dom->getElementsByTagName('script') as $script) {
        // First child is a text node with the JSON string in it
        $JSON = $script->firstChild->nodeValue;
        $data = json_decode($JSON, true);
        echo $data['I_want_this'];
    }
If the $JSON variable does not actually contain an isolated JSON string, then you'll have to use ordinary string manipulation methods to extract it from there. The main idea is that once you've pulled the string out of the document, you don't need to use DOMDocument methods anymore; just treat it like any other string.
confused and dazed Posted April 24, 2015 (Author)
As I look back through the 3000-plus lines of source code in between the script tags, a few things stick out:
1. There are MANY "function()"s before this particular bootstrap begins
2. There are several "_.extend(" methods before this particular bootstrap begins, but I'm guessing this does not mean much
3. The bootstrap is not inside any function; it stands alone
4. The bootstrap is the last set of data between the script tags
Ingolme Posted April 24, 2015
If you can show me the page, perhaps I can find a rule that can be used to extract specifically what you're looking for.
confused and dazed Posted April 24, 2015 (Author) (edited)
I can do that, but I would like to try and get it on my own... are there any particular rules you can think of for me to try? Here is a very simplistic version of what is between the script tags:
    <script>
    // comments
    // comments
    (function(){ A WHOLE LOT OF STUFF}).call(this);
    // comments
    // comments
    (function(){ ANOTHER HUGE FUNCTION WITH A WHOLE LOT OF STUFF}).call(this);
    // comments
    // comments
    Backbone.STUFF = (function() STUFF);
    // comments
    // comments
    Wreqr.Commands = (function() STUFF);
    // comments
    // comments
    var x1 = 'something1', x2 = 'something2', x3 = 'something3', x4 = 'something4', x5 = 'something5', x6 = 'something6';
    bootstrapthisstuff = {"status":"X","some_id":"112","getting_close":"ER","I_want_this":"bingo"}
    </script>
This is basically what is between the script tags.
Edited April 24, 2015 by confused and dazed
justsomeguy Posted April 24, 2015
You'll need to write a regular expression to find the line you're looking for and extract it. The variable name on the line could be part of the pattern to look for, and it sounds like you need everything on that one line (assuming it's all on one line).
confused and dazed Posted April 24, 2015 (Author) (edited)
The bootstrap is all on one line, one VERY LONG LINE. I'm not looking for any of the var values; I'm looking to extract "bingo" from the bootstrap:
bootstrapthisstuff = {"status":"X","some_id":"112","getting_close":"ER","I_want_this":"bingo"}
Also, I put in a counter ($i = 1 ... $i++; echo $i;) and it reports the correct number of script tags in the source data, which is 46. So I know it's getting into the script tags.
Edited April 24, 2015 by confused and dazed
justsomeguy Posted April 24, 2015
You need to use a regular expression to extract the piece you're looking for. Extract the entire JSON string, decode it using json_decode, then get the item you're looking for.
Ingolme Posted April 24, 2015
I'll mark out the steps:
1. Pull the text out of the <script> node with DOMDocument
2. Use a regular expression that will match everything between "bootstrapthisstuff = " and the following line break
3. Decode the JSON
4. Take the value you need out of the resulting object
justsomeguy Posted April 24, 2015
If there's not a line break like you showed, then match "bootstrapthisstuff = { ... }"
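The four steps above can be sketched roughly as follows. This is only an illustration against a cut-down stand-in for the real page: the `$html` sample and the helper name `extractBootstrapValue` are invented for the example, and the pattern assumes the whole JSON object sits on a single line, as described earlier in the thread.

```php
<?php
// Made-up stand-in for the real page; only the last line of the script matters here.
$html = <<<'HTML'
<html><head><script>
// comments
var x1 = 'something1', x2 = 'something2';
bootstrapthisstuff = {"status":"X","some_id":"112","getting_close":"ER","I_want_this":"bingo"}
</script></head><body></body></html>
HTML;

// Hypothetical helper covering steps 1 to 4.
function extractBootstrapValue(string $html, string $varName, string $key)
{
    libxml_use_internal_errors(true); // keep HTML parser warnings quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    // Without the "s" modifier, "." stops at a line break, so the greedy match
    // runs from the first "{" to the last "}" on the same line.
    $pattern = '/' . preg_quote($varName, '/') . '\s*=\s*(\{.*\})/';

    foreach ($dom->getElementsByTagName('script') as $script) {
        // Step 1: the text inside the <script> node, as a plain string.
        $text = $script->textContent;
        // Step 2: capture the JSON that follows "varName = ".
        if (preg_match($pattern, $text, $m)) {
            // Step 3: decode the captured JSON into an associative array.
            $data = json_decode($m[1], true);
            // Step 4: read the wanted value if it is present at the top level.
            if (is_array($data) && array_key_exists($key, $data)) {
                return $data[$key];
            }
        }
    }
    return null; // nothing matched or the JSON did not decode
}

echo extractBootstrapValue($html, 'bootstrapthisstuff', 'I_want_this'), "\n";
```

If the real assignment is followed by more code on the same line, the greedy match would capture too much and json_decode() would return null, so the pattern would need to be tightened for that page.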
confused and dazed Posted April 25, 2015 (Author) (edited)
    foreach ($dom->getElementsByTagName('script') as $links1) {
        preg_match('bootstrapthisstuff = { ... }', $links1);
        $data = json_decode($links1, true);
        echo $data['I_want_this'];
    }
This brings nothing...
Edited April 25, 2015 by confused and dazed
Ingolme Posted April 25, 2015
You need to construct a proper regular expression for that. You'll need to learn regular expressions.
confused and dazed Posted April 25, 2015 (Author)
No matter what I try, it appears step 2 is not treating $links1 as a string, so I don't even get a 0 when this code executes.
1. foreach($dom->getElementsByTagName('script') as $links1) {
2. echo preg_match('/no_matter_what_I_try/', $links1);
3. $data = json_decode($links1, true);
4. echo $data['I_want_this'];
Ingolme Posted April 25, 2015
$links1 is a DOMDocument node, not a string. You need to get the string out of the node first.
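To make the node-versus-string distinction concrete, here is a short sketch on a made-up one-line page. Reading `textContent` (or `nodeValue`) gives the string that preg_match() and json_decode() expect. Passing the node object itself is most likely why nothing was printed: older PHP versions only emit a warning in that case, and recent ones raise a type error.

```php
<?php
libxml_use_internal_errors(true); // keep HTML parser warnings quiet

$dom = new DOMDocument();
$dom->loadHTML('<html><head><script>var greeting = "hello";</script></head><body></body></html>');

// Each item returned by getElementsByTagName() is a node object, not a string.
$node = $dom->getElementsByTagName('script')->item(0);
$isNode = $node instanceof DOMElement;

// The text inside the node is an ordinary string.
$text = $node->textContent;

// String functions work on $text, not on $node.
$matched = preg_match('/greeting/', $text);

var_dump($isNode, is_string($text), $matched);
```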
confused and dazed Posted April 26, 2015 (Author) (edited)
OK, so I was able to sort out the regular expression call and get the bootstrap table into a string. I was able to echo the string out, so I know I captured it. The string looks like this: {the table}, and it is completely enclosed in curly brackets.
Now the next problem: I am not able to get "bingo" from "I_want_this". The bootstrap table is very complex; it follows this sort of structure:
{"main list":{"level1":1,"stuff1":[{"a1":1,"a2":"2","a3":"3","somelevel":[{"getstricky":[{"keepsgoing":"X", and so forth.......}
Also, there appears to be more than one "I_want_this" within the table, each with a different value. bingo is one, touchdown is another, and so on.
How would I capture all of the individual "I_want_this" values from the table?
Edited April 26, 2015 by confused and dazed
Ingolme Posted April 26, 2015
Did you manage to decode the JSON?
If you have done that, you can use var_dump() on the resulting object to see its structure. Show the structure here and I can give you an idea of what procedure needs to be followed to pull data from it.
confused and dazed Posted April 26, 2015 (Author)
I did the var_dump and it put string(48723) " in front of what I already wrote above, which was this:
{"main list":{"level1":1,"stuff1":[{"a1":1,"a2":"2","a3":"3","somelevel":[{"getstricky":[{"keepsgoing":"X", and so forth.......}
Any suggestions?
Ingolme Posted April 26, 2015
You have to run json_decode() on that first. What you should have is an associative array or object, not a string.
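Once the string has been decoded, one possible way to gather every "I_want_this" value from a deeply nested structure is to walk the decoded array recursively. The sketch below uses an invented, much smaller JSON sample shaped loosely like the structure described above, and the helper name `collectValues` is hypothetical. It also assumes each wanted value is a plain string or number rather than a nested array, because array_walk_recursive() only visits leaf values.

```php
<?php
// Hypothetical helper: collect every leaf value stored under $wantedKey, at any depth.
function collectValues(array $data, string $wantedKey): array
{
    $found = [];
    array_walk_recursive($data, function ($value, $key) use (&$found, $wantedKey) {
        // Numeric list indexes never equal the string key, so only named keys match.
        if ($key === $wantedKey) {
            $found[] = $value;
        }
    });
    return $found;
}

// Invented sample; the real table is far larger.
$json = '{"main list":{"level1":1,"stuff1":[{"a1":1,"I_want_this":"bingo",'
      . '"somelevel":[{"keepsgoing":"X","I_want_this":"touchdown"}]}]}}';

// Decode first: the walk needs an array, not the raw string.
$data = json_decode($json, true);
if (!is_array($data)) {
    throw new RuntimeException('JSON did not decode: ' . json_last_error_msg());
}

print_r(collectValues($data, 'I_want_this'));
```

For this sample the result holds bingo followed by touchdown, in the order the values appear in the JSON.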