
File-Indexing Script


Jesdisciple

I have a kinda complex script for building my site's menu that I obviously didn't test early enough. So any improvements are certainly welcome.

EDIT: Output buffering no longer relates to this script.

My current problem is with two sections of code:

  • The data are stored in the site's files and indicate
    1. whether those files should be indexed, and
    2. the TITLE attribute of a file's corresponding menu link.

    For example, a page which should be linked to by the menu begins like this:

    if(isset($describe)){
    	if($describe == __FILE__){
    		$descriptor = 'indexable: 1; title: blah blah blah;';
    	}
    	exit();
    }

    ($describe indicates which file is being indexed. If another one accidentally gets involved it will duck out politely.) A file which doesn't belong on the menu only has 'indexable: 0;'.

  • These functions extract those data:

    function getInfo($file){
    	global $dir;
    	$describe = $file;
    	include "$dir/$file"; // $dir is the internal absolute path to my site's root directory.
    	$array['indexable'] = intval(getInfo_helper($descriptor, 'indexable')) ? TRUE : FALSE;
    	$array['title'] = getInfo_helper($descriptor, 'title');
    	return $array;
    }
    function getInfo_helper($descriptor, $name){
    	$name = "$name: ";
    	$colon = strpos($descriptor, $name);
    	if($colon !== FALSE){
    		$colon += strlen($name);
    		$semi = strpos($descriptor, ';', $colon);
    		return substr($descriptor, $colon, $semi);
    	}else{
    		return;
    	}
    }
    ?>

My hangup is that the include in getInfo() never works... Why? (Yes, the path is absolute; this is a Linux system.)

Warning: include(/home/chris/www/projects/110mb/blogs) [function.include]: failed to open stream: No such file or directory in /home/chris/www/projects/110mb/includes/file_list.php on line 39
Warning: include() [function.include]: Failed opening '/home/chris/www/projects/110mb/blogs' for inclusion (include_path='.:/opt/lampp/lib/php') in /home/chris/www/projects/110mb/includes/file_list.php on line 39
Thanks!
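A side note on getInfo_helper() above: substr() takes a length as its third argument, not an end position, so passing $semi makes the result run past the semicolon. That would explain a title coming back with a trailing ';'. Below is a minimal sketch of a corrected lookup. The name getField() is made up for illustration and is not from the original script.

```php
<?php
// Hypothetical variant of getInfo_helper(); getField() is an assumed name.
function getField($descriptor, $name){
	$name = "$name: ";
	$start = strpos($descriptor, $name);
	if($start === FALSE){
		return NULL; // field not present in the descriptor
	}
	$start += strlen($name);
	$semi = strpos($descriptor, ';', $start);
	if($semi === FALSE){
		return substr($descriptor, $start); // no terminator: take the rest
	}
	// substr() wants a length, so subtract the start offset from the end.
	return substr($descriptor, $start, $semi - $start);
}

$descriptor = 'indexable: 1; title: blah blah blah;';
echo getField($descriptor, 'indexable'), "\n"; // 1
echo getField($descriptor, 'title'), "\n";     // blah blah blah
```

The intval() call in getInfo() hides the problem for 'indexable', because it only reads the leading digits; the title has no such filter.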

I just tried it, and it was the same as displayed in the error message. To make sure, I copied the first instance into my file browser and it opened the intended directory.

Would it give that error if I were trying to include the directory path's index file? *goes to check*

EDIT: I edited this into getInfo(), and now all the errors are gone... But the script still breaks at the call to include (and I have error_reporting(E_ALL | E_STRICT); at the top):

	$path = "$dir/$file";
	$describe = $path . (is_dir($path) ? '/index.php' : '');
	include $describe;

EDIT2: Of course! My call to exit() is doing its job... So I need to figure out how to exit the included file but not all of PHP.

EDIT3: I've verified that as my entire problem. Thanks for getting my gears unstuck!
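For anyone landing here with the same exit() problem: a return statement at the top level of an included file ends only that file, and its value becomes the result of the include expression. Here is a minimal sketch of the idea. The temporary file, the descriptor text, and the describePage() name are all made up so the example can run on its own.

```php
<?php
// Write a stand-in "page" to a temporary file so the sketch is self-contained.
// The file contents and descriptor text are invented for illustration.
$page = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($page, <<<'PAGE'
<?php
if(isset($describe)){
	// return ends only this file and hands the value back to the includer
	return 'indexable: 1; title: example page;';
}
echo 'normal page output';
PAGE
);

function describePage($describe){
	// $describe is visible inside the included file because include
	// runs in the scope of the function that calls it.
	return include $describe;
}

$descriptor = describePage($page);
echo $descriptor, "\n"; // the script keeps running after the include
unlink($page);
```

Unlike exit(), this leaves the calling script alive, so no output buffering or wrapper file is needed to collect the descriptor.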


Wow, you made an entire 21-line file (of inclusion + output buffering) completely unnecessary. Thanks!

Since this is the thread I made for this project, I'll just revive it for this next problem unless anyone has an objection.

My menu is just a bunch of nested ULs and LIs animated by CSS and JS, and I'm trying to automatically generate it. But I screwed the second-level menus up in a recent edit, and now each is named "Home" with only one item "Home," and every "Home" link points to the site's homepage.

I was just debugging my page after writing that last paragraph, and the entire menu is now messed up. I guess I need to learn how to debug... Anyway, here's the menu without the CSS+JS (linebreaks are added):

<ul>
<ul>
<li><a href="/chris/projects/110mb/index.php" title="">Home</a></li>
</ul>
<li>
<li><a href="/chris/projects/110mb/index.php" title="">Home</a></li>
1</li>
<ul>
<li><a href="/chris/projects/110mb/index.php" title="snippets, applications, and tutorials;">Home</a></li>
</ul>
<li>
<li><a href="/chris/projects/110mb/index.php" title="snippets, applications, and tutorials;">Home</a></li>
1</li>
</ul>

And here's the entire file_list.php in all its complexity... (Note that my PHP must manage many things that Apache normally does because I refuse to pay for .htaccess on my deployment server, but those parts of my sites are stable. For example, any time you see $included = TRUE; or 'includes/index.php' I am making an included file only accessible from within PHP.)

<?php
error_reporting(E_ALL | E_STRICT);
if(isset($describe)){
	if($describe == __FILE__){
		return 'indexable: 0;';
	}
	return;
}
//require $inner_root . '/includes/index.php';
/*if(!headers_sent()){header('Content-type:text/plain');}*/
if(!function_exists('getInfo')){
	function getInfo_helper($descriptor, $name){
		$name = "$name: ";
		$colon = strpos($descriptor, $name);
		if($colon !== FALSE){
			$colon += strlen($name);
			$semi = strpos($descriptor, ';', $colon);
			return substr($descriptor, $colon, $semi);
		}else{
			return;
		}
	}
	function getInfo($describe){
		global $dir;
		$included = TRUE;
		$describe .= (is_dir($describe) ? '/index.php' : '');
		$descriptor = include $describe;
		$array['indexable'] = intval(getInfo_helper($descriptor, 'indexable')) ? TRUE : FALSE;
		$array['title'] = getInfo_helper($descriptor, 'title');
		return $array;
	}
}
$level = isset($level) ? $level + 1 : 0;
$dir = isset($dir) ? $dir : $_COOKIE['inner_root'];
$files = scandir($dir);
echo "<ul>";
foreach($files as $file){
	$dot = strrpos($file, '.');
	$path = "$dir/$file";
	$name = ucwords(str_replace('_', ' ', is_file($path) && $dot !== false ? substr($file, 0, $dot) : $file));
	if($file == 'index.php'){
		$name = 'Home';
	}
	if($file != '.' && $file != '..'){
		$info = getInfo($path);

		if($info['indexable']){
			switch($file){
				case 'includes':
					break;
				default:
					$li = "<a href=\"$_COOKIE[outer_root]/$file\" title=\"$info[title]\">$name</a>";
					if(is_dir($path)){
						$_dir = $dir;
						$dir .= "/$file";
						$included = TRUE;
						$li .= include __FILE__;
						$dir = $_dir;
					}
					$li = "<li>$li</li>";
					echo $li;
			}
		}
	}
}
echo "</ul>";
$level--;
?>


I think you messed up the paste, you pasted the entire file in the middle of itself, but I got the idea. It looks like you're scanning through the list of files and looping through the list, and using like a recursive include to process each file. You can simplify that if you use a recursive function instead. It's confusing to follow that code with the recursive include, I'm not real sure what that's going to do without running it and having it print a bunch of debug info. But you should be able to simplify that with a recursive function to scan through a directory and include each file to see if it's indexable and what the title is, and run the function again for subdirectories.

<?php
function get_file_list($dir)
{
  $retval = '<ul>';
  $files = scandir($dir);
  foreach($files as $file)
  {
    # exclude the current directory and the parent directory
    if($file != '.' && $file != '..')
    {
      $retval .= '<li>';
      $path = $dir . DIRECTORY_SEPARATOR . $file;
      if (is_dir($path))
      {
        # if this is a directory, get the list for the directory and append it to everything else
        # print the directory name, or whatever else, as the menu header
        $retval .= $file;
        $retval .= get_file_list($path);
      }
      else
      {
        # if this is a file, print the entry for the file
        $dot = strrpos($file, '.');
        $name = ucwords(str_replace('_', ' ', is_file($path) && $dot !== false ? substr($file, 0, $dot) : $file));
        if($file == 'index.php')
        {
          $name = 'Home';
        }
        $info = getInfo($path);
        if($info['indexable'])
        {
          switch($file)
          {
            case 'includes':
              break;
            default:
              $retval .= "<a href=\"{$_COOKIE['outer_root']}/{$file}\" title=\"{$info['title']}\">{$name}</a>";
              break;
          }
        }
      }
      $retval .= '</li>';
    }
  }
  $retval .= '</ul>';
  return $retval;
}

function getInfo_helper($descriptor, $name){
  $name = "$name: ";
  $colon = strpos($descriptor, $name);
  if($colon !== FALSE){
    $colon += strlen($name);
    $semi = strpos($descriptor, ';', $colon);
    return substr($descriptor, $colon, $semi);
  }else{
    return;
  }
}

function getInfo($describe){
  global $dir;
  $included = TRUE;
  $describe .= (is_dir($describe) ? '/index.php' : '');
  $descriptor = include $describe;
  $array['indexable'] = intval(getInfo_helper($descriptor, 'indexable')) ? TRUE : FALSE;
  $array['title'] = getInfo_helper($descriptor, 'title');
  return $array;
}
?>

I haven't even tried to execute that so I'm not sure if there will be errors, but hopefully it gives you an idea. You would print the menu by just calling the function:

echo get_file_list(dirname(__FILE__));

If it sees a directory then it will call itself with the new directory, or else if it's a file then it just prints the link. I had to look this up to make sure, but a nested UL should be inside of an LI, e.g.

<ul>
  <li>level 1
    <ul>
      <li>subitem 1</li>
      <li>subitem 2</li>
      <li>subitem 3</li>
    </ul>
  </li>
  <li>main item</li>
</ul>

That function should produce a structure like that, where "level 1" would be a folder name and the items under it would be files in that folder. It will keep going as deep as necessary if it finds more subdirectories.


Ugh... Yes, thanks for showing me that. I've edited it now. Ever since I switched to Linux the middle mouse-button (wheel for me) means "paste." I have a habit of using the wheel to scroll, so you can imagine the conflict.

I'm sorry I couldn't explain the script; I was pressed for time and I'm not very good at that anyway. But you got the gist well.

Thanks for that suggestion... I hadn't even considered using a function as the main routine; I have trouble stepping back from my code to take in the big picture.

Now the files at the site's root aren't regarded as indexable even though they claim to be - and some titles are missing, but I imagine that's the same bug. (BTW, the main reason I rearranged the function was to make the menu degradable so it will still work if JS doesn't - i.e. directories and files are treated essentially the same. I also notice that we style our code very differently...)

<ul>
	<li><a href="/home/chris/www/projects/110mb/blogs" title="">Blogs</a>
		<ul>
			<li><a href="/home/chris/www/projects/110mb/blogs/electronic_incantations.php" title="">Electronic Incantations</a></li>
			<li><a href="/home/chris/www/projects/110mb/blogs/index.php" title="">Home</a></li>
			<li><a href="/home/chris/www/projects/110mb/blogs/no,_it's_not_42.php" title="my social blog;">No, It's Not 42</a></li>
		</ul>
	</li>
	<li><a href="/home/chris/www/projects/110mb/code" title="snippets, applications, and tutorials;">Code</a>
		<ul>
			<li><a href="/home/chris/www/projects/110mb/code/index.php" title="snippets, applications, and tutorials;">Home</a></li>
		</ul>
	</li>
</ul>

<?php
error_reporting(E_ALL | E_STRICT);
if(isset($describe)){
	if($describe == __FILE__){
		return 'indexable: 0;';
	}
	return;
}
//require $inner_root . '/includes/index.php';
header('Content-type:text/plain');
function getFileList($dir){
	$retval = '<ul>';

	$files = scandir($dir);
	foreach($files as $file){
		# exclude the current directory and the parent directory
		if($file != '.' && $file != '..'){
			$path = $dir . DIRECTORY_SEPARATOR . $file;
			$info = getInfo(is_dir($path) ? $path . DIRECTORY_SEPARATOR . 'index.php' : $path);
			if($info['indexable']){
				$dot = strrpos($file, '.');
				$name = ucwords(str_replace('_', ' ', is_file($path) && $dot !== false ? substr($file, 0, $dot) : $file));
				if(is_file($path) && $file == 'index.php'){
					$name = 'Home';
				}
				$retval .= "<li><a href=\"{$path}\" title=\"{$info['title']}\">{$name}</a>";
				# if this is a directory, get the list for the directory and append it to everything else
				if (is_dir($path) && $path != "{$_COOKIE['inner_root']}/includes"){
					$retval .= getFileList($path);
				}
				$retval .= '</li>';
			}
		}
	}

	$retval .= '</ul>';

	return $retval;
}
function getInfo_helper($descriptor, $name){
	$name = "$name: ";
	$colon = strpos($descriptor, $name);
	if($colon !== FALSE){
		$colon += strlen($name);
		$semi = strpos($descriptor, ';', $colon);
		return substr($descriptor, $colon, $semi);
	}
}
function getInfo($describe){
	global $dir;
	$included = TRUE;
	$describe .= (is_dir($describe) ? '/index.php' : '');
	$descriptor = include $describe;
	$array['indexable'] = intval(getInfo_helper($descriptor, 'indexable')) ? TRUE : FALSE;
	$array['title'] = getInfo_helper($descriptor, 'title');
	return $array;
}
echo getFileList($_COOKIE['inner_root']);
?>


Ooh, you're right; thanks!

I figured out that my error was in how I was trying to transmit the data: I forgot to convert many of my $descriptor assignments to return statements.


I got this problem solved; see the bottom of this post. I'll leave this up here just to record the incident.

Weird update: I now see this string in the HTML source, directly before the UL (the two boxes are better seen in this page's source, and the gap on the second line is supposed to be a tab character):

PK?ñ8blogs/UT	??

It was alternating between that and a similar but shorter one, but now I can't get the other to come back. Here's my PHP: [removed] What'd I do...?

EDIT: Maybe what I know I did will matter... I had just zipped my site up and uploaded it to my deployment server where my script attempted to include a hidden Linux file (starting with a dot) that I don't have permission to read. So I went back to my copy and changed this line:

		if($file != '.' && $file != '..'){

to this:

		if(strpos($file, '.') === FALSE){

When no ordinary files, only directories, were displayed, I undid my edit in Gedit, my text processor. I refreshed the page and observed the above behavior.

EDIT2: Something was wrong with the characters my undo put on that line. I changed them and the problem went away. Sorry I bumped this topic...
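On why only directories showed up after that change: strpos($file, '.') === FALSE is true only for names with no dot anywhere, so it also throws out index.php and every other file with an extension. To skip hidden entries it should be enough to look at the first character. A minimal sketch follows. isHidden() is a made-up helper name and the entry list is sample data.

```php
<?php
// isHidden() is a hypothetical helper name, not something from the thread.
function isHidden($file){
	// Covers '.', '..' and dotfiles such as '.htaccess'.
	return strncmp($file, '.', 1) === 0;
}

// Sample directory listing, invented for the demonstration.
$entries = array('.', '..', '.hidden', 'index.php', 'blogs', 'me.php');
$visible = array();
foreach($entries as $file){
	if(!isHidden($file)){
		$visible[] = $file;
	}
}
print_r($visible); // index.php, blogs and me.php remain
```

Because '.' and '..' also start with a dot, this one test can stand in for the original `$file != '.' && $file != '..'` condition.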


I apparently got a false negative at the end of that post. As a result, I submitted a possibly false bug report. Here's my page's HTML source:

<?php
error_reporting(E_ALL | E_STRICT);
$descriptor = 'indexable: 0;';
$current_file = __FILE__;
switch($_SERVER['HTTP_HOST']) {
	case 'localhost':
		define('OUTER_ROOT', '/chris/projects/110mb');
		define('INNER_ROOT', '/home/chris/www/projects/110mb');
		break;
	case "www.jesdisciple.110mb.com": case "jesdisciple.110mb.com": case "www.jesdisciple.co.cc": case "jesdisciple.co.cc": default:
		define('OUTER_ROOT', '');
		define('INNER_ROOT', '/www/110mb.com/j/e/s/d/i/s/c/i/jesdisciple/htdocs');
		break;
}
//require INNER_ROOT . '/includes/index.php';
//header('Content-type:text/plain');
function getFileList($dir){
	$retval = '<ul>';

	$files = scandir(INNER_ROOT . $dir);
	for($i = 0; $i < count($files); $i++){
		if($files[$i] == 'index.php' && $i != 0){
			$file = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
			array_splice($files, 0, 0, $file);//prefix the array with it
		}
		if(is_dir($files[$i])){
			$dirs[] = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
		}
	}
	array_splice($files, -1, 0, $dirs);
	foreach($files as $file){
		# exclude the current directory and the parent directory
		if($file != '.' && $file != '..'){
			$path = ($dir == DIRECTORY_SEPARATOR ? $dir : $dir . DIRECTORY_SEPARATOR) . $file;
			$info = getInfo(INNER_ROOT . (is_dir($path) ? $path . DIRECTORY_SEPARATOR . 'index.php' : $path));
			if($info['indexable']){
				$dot = strrpos($file, '.');
				$name = ucwords(str_replace('_', ' ', is_file(INNER_ROOT . $path) && $dot !== false ? substr($file, 0, $dot) : $file));
				if(is_file(INNER_ROOT . $path) && $file == 'index.php'){
					$name = 'Home';
				}
				$constants = get_defined_constants();
				$retval .= "<li><a href=\"{$constants['OUTER_ROOT']}{$path}\" title=\"{$info['title']}\">{$name}</a>";
				# if this is a directory, get the list for the directory and append it to everything else
				if (is_dir(INNER_ROOT . $path) && $file != '/includes'){
					$retval .= getFileList($path);
				}
				$retval .= '</li>';
			}
		}
	}

	$retval .= '</ul>';

	return $retval;
}
function getInfo_helper($descriptor, $name){
	$name = "$name: ";
	$colon = strpos($descriptor, $name);
	if($colon !== FALSE){
		$colon += strlen($name);
		$semi = strpos($descriptor, ';', $colon);
		return substr($descriptor, $colon, $semi);
	}
}
function getInfo($describe){
	$included = TRUE;
	$describe .= (is_dir($describe) ? '/index.php' : '');
	$descriptor = include $describe;
	$array['indexable'] = intval(getInfo_helper($descriptor, 'indexable')) ? TRUE : FALSE;
	$array['title'] = getInfo_helper($descriptor, 'title');
	return $array;
}
echo getFileList(DIRECTORY_SEPARATOR);
?>

Please help!

EDIT: I've devised a test which shows that the problem is with the if-statement (or, perhaps, its contents) on the foreach's 14th iteration and NOT its boolean input...

test script:

<?php
error_reporting(E_ALL | E_STRICT);
$descriptor = 'indexable: 0;';
$current_file = __FILE__;
switch($_SERVER['HTTP_HOST']) {
	case 'localhost':
		define('OUTER_ROOT', '/chris/projects/110mb');
		define('INNER_ROOT', '/home/chris/www/projects/110mb');
		break;
	case "www.jesdisciple.110mb.com": case "jesdisciple.110mb.com": case "www.jesdisciple.co.cc": case "jesdisciple.co.cc": default:
		define('OUTER_ROOT', '');
		define('INNER_ROOT', '/www/110mb.com/j/e/s/d/i/s/c/i/jesdisciple/htdocs');
		break;
}
//require INNER_ROOT . '/includes/index.php';
//header('Content-type:text/plain');
$indices = array(0, 0, 0, 0);
function getFileList($dir){
	$retval = '<ul>';
	$files = scandir(INNER_ROOT . $dir);
	for($i = 0; $i < count($files); $i++){
		if($files[$i] == 'index.php' && $i != 0){
			$file = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
			array_splice($files, 0, 0, $file);//prefix the array with it
		}
		if(is_dir($files[$i])){
			$dirs[] = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
		}
	}
	array_splice($files, -1, 0, $dirs);
	foreach($files as $file){
		# exclude the current directory and the parent directory
global $indices;
echo "__0_{$indices[0]}__";$indices[0]++;
		$if = $file != '.' && $file != '..';
echo "__1_{$indices[1]}__";$indices[1]++;
		if($if){
echo "__2_{$indices[2]}__";$indices[2]++;
			$path = ($dir == DIRECTORY_SEPARATOR ? $dir : $dir . DIRECTORY_SEPARATOR) . $file;
			$info = getInfo(INNER_ROOT . (is_dir($path) ? $path . DIRECTORY_SEPARATOR . 'index.php' : $path));
			if($info['indexable']){
				$dot = strrpos($file, '.');
				$name = ucwords(str_replace('_', ' ', is_file(INNER_ROOT . $path) && $dot !== false ? substr($file, 0, $dot) : $file));
				if(is_file(INNER_ROOT . $path) && $file == 'index.php'){
					$name = 'Home';
				}
				$constants = get_defined_constants();
				$retval .= "<li><a href=\"{$constants['OUTER_ROOT']}{$path}\" title=\"{$info['title']}\">{$name}</a>";
				# if this is a directory, get the list for the directory and append it to everything else
				if (is_dir(INNER_ROOT . $path) && $file != '/includes'){
					$retval .= getFileList($path);
				}
				$retval .= '</li>';
			}
		}
echo "__3_{$indices[3]}__";$indices[3]++;
	}

	$retval .= '</ul>';

	return $retval;
}
function getInfo_helper($descriptor, $name){
	$name = "$name: ";
	$colon = strpos($descriptor, $name);
	if($colon !== FALSE){
		$colon += strlen($name);
		$semi = strpos($descriptor, ';', $colon);
		return substr($descriptor, $colon, $semi);
	}
}
function getInfo($describe){
	$included = TRUE;
	$describe .= (is_dir($describe) ? '/index.php' : '');
	$descriptor = include $describe;
	$array['indexable'] = intval(getInfo_helper($descriptor, 'indexable')) ? TRUE : FALSE;
	$array['title'] = getInfo_helper($descriptor, 'title');
	return $array;
}
echo getFileList(DIRECTORY_SEPARATOR);
?>

test output:

__0_0____1_0____2_0____3_0____0_1____1_1____3_1____0_2____1_2____2_1____0_3____1_3____2_2____3_2____0_4____1_4____3_3____0_5____1_5____2_3____3_4____0_6____1_6____3_5____0_7____1_7____2_4____3_6____3_7____0_8____1_8____2_5____0_9____1_9____3_8____0_10____1_10____3_9____0_11____1_11____2_6____3_10____3_11____0_12____1_12____2_7____3_12____0_13____1_13____2_8__PK?ñ8blogs/UT	??__3_13____0_14____1_14____2_9____3_14____0_15____1_15____2_10____3_15____0_16____1_16____2_11____3_16____0_17____1_17____2_12____3_17____0_18____1_18____2_13____3_18____0_19____1_19____2_14____3_19____0_20____1_20____3_20____0_21____1_21____2_15____3_21__<ul><li><a href="/chris/projects/110mb/index.php" title="back to my front page;">Home</a></li><li><a href="/chris/projects/110mb/blogs" title="the Jesdisciple Press :);">Blogs</a><ul><li><a href="/chris/projects/110mb/blogs/index.php" title="the Jesdisciple Press :);">Home</a></li><li><a href="/chris/projects/110mb/blogs/electronic_incantations.php" title="">Electronic Incantations</a></li><li><a href="/chris/projects/110mb/blogs/no,_it's_not_42.php" title="my social blog;">No, It's Not 42</a></li></ul></li><li><a href="/chris/projects/110mb/code" title="snippets, applications, and tutorials;">Code</a><ul><li><a href="/chris/projects/110mb/code/index.php" title="snippets, applications, and tutorials;">Home</a></li></ul></li><li><a href="/chris/projects/110mb/me.php" title="who I am and how to contact me;">Me</a></li><li><a href="/chris/projects/110mb/site.php" title="purpose, design, feedback, etc.;">Site</a></li></ul>
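A possible lead on that stray string: ZIP archives begin with the bytes "PK", and include prints anything outside PHP tags verbatim, so output like this suggests getInfo() was handed a file that is not PHP source. One way to rule that out is to check each path before including it. This is only a sketch. isIncludable() is a made-up helper, and the two demo files are throwaway stand-ins.

```php
<?php
// isIncludable() is a hypothetical helper; the thread's code has no such check.
function isIncludable($path){
	return is_file($path)
		&& is_readable($path)
		&& strtolower(pathinfo($path, PATHINFO_EXTENSION)) === 'php';
}

// Demonstration with two throwaway files in a temporary directory.
$tmp = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'menu_demo_' . uniqid();
mkdir($tmp);
file_put_contents("$tmp/page.php", "<?php return 'indexable: 1;';");
file_put_contents("$tmp/site.zip", "PK\x03\x04 not really an archive");

foreach(array('page.php', 'site.zip') as $file){
	echo $file, ': ', isIncludable("$tmp/$file") ? 'include' : 'skip', "\n";
}

// Clean up the demo files.
unlink("$tmp/page.php");
unlink("$tmp/site.zip");
rmdir($tmp);
```

The is_readable() test also covers the earlier case of a file the web server has no permission to read.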


That bug was apparently caused by my mistake of leaving my archived website in the root folder. The script was trying to include the archive.

Now, my menu looks like this:

	* Home
	* Blogs
		  o Home
		  o Electronic Incantations
		  o No, It's Not 42
	* Code
		  o Home
	* Me
	* Site

I want it like this (any directories at the bottom of each menu):

	* Home
	* Me
	* Site
	* Blogs
		  o Home
		  o Electronic Incantations
		  o No, It's Not 42
	* Code
		  o Home

Here's what I'm using to sort the items for now:

	for($i = 0; $i < count($files); $i++){
		if($files[$i] == 'index.php' && $i != 0){
			$file = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
			array_splice($files, 0, 0, $file);//prefix the array with it
		}
		if(is_dir($files[$i])){
			$dirs[] = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
		}
	}
	array_splice($files, -1, 0, $dirs);

What am I missing?

Thanks!


I'm not sure what's going on there, but you can keep a separate array for files and directories and merge them before you print.

	$file_list = $dir_list = array();
	for($i = 0; $i < count($files); $i++){
		if(is_dir($files[$i])){
			$dir_list[] = $files[$i];
		}
		else{
			$file_list[] = $files[$i];
		}
	}
	$menu = array_merge($file_list, $dir_list);

Before you merge you can sort or do whatever else you want to do.


I combined your script into mine and the submenus are still at the top, just below Home. I can't seem to get them to the bottom.

	$file_list = $dir_list = array();
	for($i = 0; $i < count($files); $i++){
		if(is_dir($files[$i])){
			$dir_list[] = $files[$i];
		}elseif($files[$i] == 'index.php'){
			$file = $files[$i];//copy it
			array_splice($files, $i, 1);//remove it
			array_splice($file_list, 0, 0, $file);//prefix the array with it
		}else{
			$file_list[] = $files[$i];
		}
	}
	$files = array_merge($file_list, $dir_list);

But let me ask your opinion: Does it seem more natural to you to have submenus at the top as with desktop apps?

Link to comment
Share on other sites

Print file_list and dir_list after the loop ends. If is_dir isn't getting the full path then it won't ever return true and everything will go into the file list. Also, you don't need to use array_splice to remove the element, while looping through the original array it's probably best to leave it unaltered.
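To make that concrete: scandir() returns bare names, so is_dir() needs the directory prepended, and splicing the array while a counter walks it makes the loop skip the element that slides into the freed slot. Here is a minimal sketch that leaves the scanned array alone and builds three buckets instead. orderEntries() is a made-up name, and the demo tree only mimics the layout described in the thread.

```php
<?php
// orderEntries() is a hypothetical name. $dir must be the full path that
// scandir() is given, because scandir() returns bare names.
function orderEntries($dir){
	$index = $file_list = $dir_list = array();
	foreach(scandir($dir) as $file){
		if($file == '.' || $file == '..'){
			continue;
		}
		$path = $dir . DIRECTORY_SEPARATOR . $file;
		if(is_dir($path)){
			$dir_list[] = $file;
		}elseif($file == 'index.php'){
			$index[] = $file;
		}else{
			$file_list[] = $file;
		}
	}
	// index.php first, then other files, then directories
	return array_merge($index, $file_list, $dir_list);
}

// Demonstration against a throwaway directory tree.
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'order_demo_' . uniqid();
mkdir($root);
mkdir("$root/blogs");
mkdir("$root/code");
foreach(array('index.php', 'me.php', 'site.php') as $file){
	touch("$root/$file");
}

print_r(orderEntries($root)); // index.php, me.php, site.php, blogs, code

// Clean up the demo tree.
foreach(array('index.php', 'me.php', 'site.php') as $file){
	unlink("$root/$file");
}
rmdir("$root/blogs");
rmdir("$root/code");
rmdir($root);
```

Inside getFileList() the full path would be INNER_ROOT . $dir rather than a temporary directory, but the bucket logic stays the same.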
