
substr function


Fmdpa


How do I execute the substr() function on an external document? For example, I am creating a page that lists articles. Each article mentioned on the page would first have a dynamic h2 title (fetched from the <title> of a specified file, e.g. article1.php), and underneath that would be a paragraph of the first 30 words in the body of the specified file. Then I want to add a class to both elements, so they can be formatted via CSS. How would I do these three things?


If the page is a flat file you'll need to read it into memory first. If it's in a database you'll need to pull the row it's in. You could use a regex pattern to locate and extract the title. Do the same with the paragraph you want. Explode the paragraph along the spaces and grab the first 30 elements, then glue them back together. This function will do that, and add '...' to the end:

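function shorten_string($string, $num_words)
{
    // Split on single spaces; each element is treated as one word.
    $array = explode(" ", $string);
    if (count($array) <= $num_words) {
        // Already short enough: return the string unchanged.
        return $string;
    }
    // Drop everything after the first $num_words words, then add the ellipsis.
    array_splice($array, $num_words);
    return implode(" ", $array) . " ...";
}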

(Not my function, can't recall where I found this. It's a little clumsy, but it works.)
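One thing to watch with explode(" ", ...) is that text read from an HTML file usually contains newlines and runs of spaces, which throw the word count off. Here's a rough variant that splits on any whitespace instead. The name first_words and the $suffix parameter are made up for this example, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the first $limit words of $text.
function first_words(string $text, int $limit, string $suffix = ' ...'): string
{
    // Split on any run of whitespace (spaces, tabs, newlines), dropping empty pieces.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $limit) {
        // Short enough already: return the words with whitespace normalized.
        return implode(' ', $words);
    }
    // Keep only the first $limit words and mark the cut with the suffix.
    return implode(' ', array_slice($words, 0, $limit)) . $suffix;
}

echo first_words("The quick  brown\nfox jumps over the lazy dog", 4), "\n";
// prints: The quick brown fox ...
```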


Here are some functions that can help you:
http://php.net/manual/en/function.file-get-contents.php (to get the content from the file)
http://www.php.net/manual/en/function.preg-match-all.php (to extract the title tag and the text)
http://www.php.net/manual/en/function.strip-tags.php (to clean the text of any HTML or JS)
http://www.php.net/manual/en/function.substr.php (to return a part of the text)
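Roughly, those functions could be chained like this. It's only a sketch: summarize_html is a made-up name, the patterns assume a simple, well-formed page, and the HTML is an inline string here so the example runs on its own. For a static file, the string would come from file_get_contents().

```php
<?php
// Hypothetical helper: pulls the <title> text and the first $chars
// characters of body text out of an HTML string.
function summarize_html(string $html, int $chars): array
{
    $title = '';
    // Non-greedy match so we stop at the first closing </title>.
    if (preg_match('|<title[^>]*>(.*?)</title>|si', $html, $m)) {
        $title = trim($m[1]);
    }

    // Fall back to the whole document if no <body> element is found.
    $body = $html;
    if (preg_match('|<body[^>]*>(.*)</body>|si', $html, $m)) {
        $body = $m[1];
    }

    // strip_tags() removes the markup; then collapse leftover whitespace.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($body)));

    return ['title' => $title, 'excerpt' => substr($text, 0, $chars)];
}

// Stand-in for: $html = file_get_contents('page.html');
$html = "<html><head><title>Article One</title></head>"
      . "<body><p>First <b>paragraph</b> of text.</p>\n<p>Second paragraph.</p></body></html>";

$summary = summarize_html($html, 15);
echo $summary['title'], "\n";   // prints: Article One
echo $summary['excerpt'], "\n"; // prints: First paragraph
```

Note that substr() counts characters, not words, so it can cut a word in half. Splitting on spaces is the better fit if you want exactly 30 words.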


I tried using the file_get_contents function, but it did almost the same thing as include. How could I just extract text from the body section? From there, I could run it through the substr function.


It depends how the pages are set up. If it's a static HTML page or if you want to read the PHP code, then you can use file_get_contents. If it's a dynamic PHP page and you want the output of the page, you need to use output buffering to include the file and then get the output of the script from the buffer. That's how you get the content of the page; you can process it however you want after that.
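To illustrate the difference, here's a minimal sketch of the output-buffering approach next to file_get_contents. The helper name render_php_file is invented for the example, and the temporary file is just a stand-in for a real page such as article1.php.

```php
<?php
// Hypothetical helper: runs a PHP file and returns what it printed.
function render_php_file(string $path): string
{
    ob_start();                     // start capturing output
    include $path;                  // run the script; its output goes into the buffer
    return (string) ob_get_clean(); // fetch the buffer contents and stop buffering
}

// Throwaway script standing in for a real dynamic page.
$tmp = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($tmp, '<p><?php echo 1 + 1; ?></p>');

echo render_php_file($tmp), "\n";   // the script's output: <p>2</p>
echo file_get_contents($tmp), "\n"; // the raw source, PHP code included
unlink($tmp);
```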


The area I want to fetch is static HTML (in fact, text from in between <p> tags to be specific, would be ideal, but all text from the body would suffice). I tried file_get_contents, but it returned the entire page (except php, of course), bg and all. The code I used was:

file_get_contents('page.html');


The area I want to fetch is static HTML (in fact, text from in between <p> tags to be specific, would be ideal, but all text from the body would suffice).
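Try this:
$str = file_get_contents('page.html');
preg_match("|<p>(.+?)</p>|si", $str, $match);

The full match, including the <p> tags, will be in $match[0], and the text between the tags will be in $match[1]. The non-greedy (.+?) stops at the first closing </p> instead of running on to the last one in the file. Note that this pattern only matches a bare <p> with no attributes.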
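If you want every paragraph rather than just the first, preg_match_all is one way to do it. This is a quick sketch, not a robust HTML parser: paragraph_texts is a made-up name, and the pattern assumes paragraphs are not nested and always have a closing tag.

```php
<?php
// Hypothetical helper: returns the inner content of every <p> element.
function paragraph_texts(string $html): array
{
    // <p\b[^>]*> allows attributes on the tag without matching <pre> etc.
    // (.*?) is non-greedy, so each match ends at the nearest </p>.
    preg_match_all('|<p\b[^>]*>(.*?)</p>|si', $html, $matches);
    return $matches[1]; // index 1 holds the captured group for each match
}

$str = '<p>First one.</p><pre>skip</pre><p class="x">Second one.</p>';
print_r(paragraph_texts($str));
// Array ( [0] => First one. [1] => Second one. )
```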


Note that this technique is really only useful if you know which <p> element you are looking for. If the index will always be the same (like, you always want the first <p> element) then this will work fine.

Depending on what you're after, you might do better to construct a DOMDocument object from the file, rather than using file_get_contents. This would let you use DOM techniques to find exactly the paragraph you want, based on its id or some other attribute.

Here are some techniques you can use. (This is not tested. It's mostly off the top of my head.)

// load the file
$file = 'something.html';
$doc = new DOMDocument();
$doc->loadHTMLFile($file);

// if you have an id:
$myPara = $doc->getElementById('something');
// $myPara->nodeValue returns the content of the <p> element whose id is 'something'

// if you're looking for a <p> with a certain attribute:
$bluePara = null;
$allParas = $doc->getElementsByTagName('p');
foreach ($allParas as $para) {
    if ($para->getAttribute('class') == 'blue') {
        $bluePara = $para;
        break;
    }
}
// $bluePara->nodeValue returns the content of the first <p> element whose class attribute is 'blue'
// ($bluePara stays null if no such element exists)

More here: http://us3.php.net/manual/en/class.domdocument.php
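Putting the pieces of this thread together for the original question, something along these lines might work. Treat it as a sketch: article_teaser and the class names article-title and article-excerpt are invented for the example, it assumes the DOM extension is enabled, and it takes an HTML string (for a static file you could pass file_get_contents($file), or switch to loadHTMLFile).

```php
<?php
// Hypothetical helper: builds an <h2> from the page's <title> and a <p>
// holding the first $numWords words of the body text.
function article_teaser(string $html, int $numWords = 30): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings about sloppy HTML quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $titleNode = $doc->getElementsByTagName('title')->item(0);
    $bodyNode  = $doc->getElementsByTagName('body')->item(0);
    $title = $titleNode ? trim($titleNode->textContent) : '';
    $text  = $bodyNode ? $bodyNode->textContent : '';

    // Split on whitespace and keep the first $numWords words.
    $words   = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $excerpt = implode(' ', array_slice($words, 0, $numWords));
    if (count($words) > $numWords) {
        $excerpt .= ' ...';
    }

    // The class attributes are what the CSS would hook into.
    return '<h2 class="article-title">' . htmlspecialchars($title) . "</h2>\n"
         . '<p class="article-excerpt">' . htmlspecialchars($excerpt) . "</p>\n";
}

$html = '<html><head><title>Article One</title></head>'
      . '<body><p>One two three four five six.</p></body></html>';
echo article_teaser($html, 3);
// <h2 class="article-title">Article One</h2>
// <p class="article-excerpt">One two three ...</p>
```

The listing page would call this once per article and style the two classes in its stylesheet.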


Archived

This topic is now archived and is closed to further replies.
