Jump to content

array vs. hash table?


skaterdav85

Recommended Posts

A hash table is a more complex structure, it's typically implemented using arrays for the elements. The first level is typically an array of a finite size, like if the hash table is storing strings in alphabetic order then the first level would have 1 element for each letter in the alphabet. A string that starts with an A would go into the array of strings that start with A. Most hash tables have several levels, so the container for things starting with A might be another array of each level. So if you were searching for "Arizona", it might look that up in $table['a']['r']['i'], which would be another array of words starting with "ari", and it would search through them for the one it's looking for.Hash tables speed up manual searching because instead of looking through the entire list you can narrow it down based on what the item starts with, and you can narrow down the search quickly and easily. In that example, the hash for "Arizona" is "ari".

Link to comment
Share on other sites

A hash table is one way to store data so that it's easy to search for later. An array is simply a list of values, each one having a key to identify it.Edit: Justsomeguy has a much better answer

Link to comment
Share on other sites

A hash table would definitely be faster than searching an XML file, because all of the data would already be in memory. Of course, you still have to read through the whole file first, but assuming you can save the data structure you would only have to do that once. It's not going to help a lot with a database because many storage engines are already optimized for quick searching. Some storage engines probably use hash tables to speed up searches.

Link to comment
Share on other sites

When you read an XML file into a DOM Document, it's in RAM also, yes? I don't see an advantage to a conversion here.Maybe if you were stuck with a bunch of flat files that had fields named in a relational way, but weren't actually in a database . . .This is kind of legacy technology? Probably in the guts of a DB engine, but not something you'd build yourself.

Link to comment
Share on other sites

When you read an XML file into a DOM Document, it's in RAM also, yes? I don't see an advantage to a conversion here.Maybe if you were stuck with a bunch of flat files that had fields named in a relational way, but weren't actually in a database . . .This is kind of legacy technology? Probably in the guts of a DB engine, but not something you'd build yourself.
Are you saying a hash table is a legacy technology and not something you'd build yourself? I'm new to the concept of hash tables and im not really sure when/where they are needed...
Link to comment
Share on other sites

I'd say a "hash table" is a deeper technology, that is not something you'd build yourself, because most of the stuff you'll use already uses hash tables under the hood.An associative array is a kind of a hash table. In C or C++, you don't have strings as keys to arrays. You only have an offset (as an integer) from the array memory location. The standard C++ template library map.h implements string keys as a generic hash table. Of course, THAT hash table is not optimized at all, but you can see that even in that case, someone has built a hash table.The only people who create optimized hash tables are the people working on the technologies we use - the PHP developers, the MySQL developers, browser developers, [insert C(++)/JAVA application] developers, etc.

Link to comment
Share on other sites

When you read an XML file into a DOM Document, it's in RAM also, yes? I don't see an advantage to a conversion here.
It is, but I'm not sure if the node structure is optimized for searching like a hash table would be.
I'm new to the concept of hash tables and im not really sure when/where they are needed...
It sounds like you're seeing it as a solution looking for a problem. You have the solution, and you want to know what the problem is. Don't worry about that, just understand what a hash table is and if you find a problem that is best solved with a hash table then hopefully you'll remember it. But don't try to fit it into something where it doesn't belong just for the purpose of using it.If you do want to use a hash table just to use a hash table, then build a dictionary application where you have words and definitions stored in a hash table where you look them up. Something like that is a common use for hash tables, but with databases that type of thing is unnecessary, a database will do the job faster.
Link to comment
Share on other sites

It sounds like you're seeing it as a solution looking for a problem. You have the solution, and you want to know what the problem is. Don't worry about that, just understand what a hash table is and if you find a problem that is best solved with a hash table then hopefully you'll remember it. But don't try to fit it into something where it doesn't belong just for the purpose of using it.If you do want to use a hash table just to use a hash table, then build a dictionary application where you have words and definitions stored in a hash table where you look them up. Something like that is a common use for hash tables, but with databases that type of thing is unnecessary, a database will do the job faster.
My reason for asking was because I was asked in an interview "What is the difference between a hash table and an array in PHP?". I wasn't very familiar with the concept of a hash table since I had never used one. The guy told me the answer but I don't quite remember what he said exactly. I think he said something like if you add data to an array, the data is placed at the end whereas in a hash table data is not placed at the end. I guess that would make sense based on what you said about hash tables being fancy multidimensional associative arrays. If you have some data that starts with the letter 'b', then the data wont be placed at the end of the hash table but instead where the key for the first letter is 'b' for searching purposes.
Link to comment
Share on other sites

So either the interviewer is speaking inexactly about PHP without knowing it;orthe interviewer expects you to know that the closest equivalent to a hash table in PHP is an associative arrayorthe interviewer really does expect you to know how native PHP data types can be spun into something that works like a hash table.JSG, do you know if PHP constructs associative arrays with all the efficiency of a true hash table, like managing collisions and memory locations, dynamic resizing, etc?

Link to comment
Share on other sites

Probably not all of them. It doesn't really need to manage collisions, because a collision would just overwrite the value. From what I understand, an associative array in PHP is like a special-purpose hash table where each hash only has one value. Like for this:$ar['foo'] = 'bar';The hash of "bar" is "foo". Obviously a collision on the hash would just overwrite the value in that case. According to this page:http://www.php.net/manual/en/internals2.me....management.phpIt looks like they can use something like erealloc to change the size of the array dynamically.

Link to comment
Share on other sites

EDIT: Wrote this before I saw jsg's post above. It could be all wrong.---Ah. Noticed this at wikipedia:

Hash tables are commonly used to implement many types of in-memory tables. They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects), especially in interpreted programming languages like AWK, Perl, and PHP.
So I guess this is one of those under-the-hood things, and PHP uses it so you don't have to.That might mean, for example, that if your associative array has 10 unique keys that all point to the same value (something like "Argentina", say), but the assignments were all made uniquely (not copied from the same variable), then PHP is smart enough to know that the value really is the same (we have a collision) and would therefore create only one copy in memory, and all the keys somehow point to it.It would also store the underlying hash in an easily indexable fashion, not just in a random order, the way your script might create it.Good to know.And interesting that an interviewer seems to want candidates who know what's going on under the hood in addition to the stuff you can access directly. This would tend to separate developers who know at least a little computer science (as distinct from programming only).
Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...