Jump to content

Searching Text Variants


Skemcin

Recommended Posts

So there use case goes like this:A member of the site is allowed to edit their user profile (as most would expect). They have the ability to enter their full address information and the state is a drop box as is the country and even county/region. The street address and city/town are regular text inputs just as with their first and last names. So for, no big deal, typical run of the mill forms and db columns.The challenge now is when a tool is built to search certain areas of this information - mainly the city/town. Without any tricks and just by doing your typical SQL statements, it is inevitable that two users living int he same town will edit their information differently. For instance, one person will put their city/town as "St. Vincent" and the other "Saint Vincent" and yet one more as "St Vincent". I know that address verification services are available and building a tool to correct or suggest corrections could be build using a address verification web service - if so subscribed. But if my client doesn't wish to pay the subscription service and there is no means of ensuring each person knows the right way to enter the information, what techniques can be used to produce more accurate search results?Lets forget about convincing the client to ###### it up and buy address verification and lets also skip past the "tough luck, thats just the way its gonna be, nothing is ever really going to catch everything" replies.Is there a better way of tackling this from the database perspective on insert or update or is it better managed on the programming side either on the edit page or manipulating the search programming?

Link to comment
Share on other sites

Just so we're all clear, you're looking for a solution to search the database of users based on data that is input by those users? You mentioned that "when a tool is built [you want] to search certain areas of this information - mainly the city/town" - ostensibly to locate users in the system whom you could contact to announce this new tool?Perhaps you could set up two additional tables: one that stores various regions and one that stores the relationships between users and regions. You could run a query to determine which users do not have an associated region, load those users up on some form, and have an administrator decide, based on the city/state, to which region that user needs to be assigned. Then, when new tools are announced, you could contact the users based on certain regions. If the user edits their information, the relationship in the UsersToRegions table could be deleted and the process could begin from scratch for that user.Of course, that solution would become entirely unweildly depending on the number of users in the system.

Link to comment
Share on other sites

Thanks for the reply, and sorry for the confusion my initial post might have conveyed.In short, I need to fix a clients search interface so that it can return better results. The clients members enter their own data so there is very little control over that. So when a current search for "St. Vincent" is done, the results do not include "St Vincent" and "Saint Vincent".While I work on the client to integrate address verification on the member entry and edit form and while I work on using AJAX for suggesting the existing distinct values currently in the database, I was wondering if there were other ideas or techniques that I might consider to improve the search results.

Link to comment
Share on other sites

So when a current search for "St. Vincent" is done, the results do not include "St Vincent" and "Saint Vincent".
Have you looked into using FullText indexing on that column in that table? Using FullText searching, the search for "vincent" would return "Saint Vincent", "St. Vincent", "st vincent", etc. I don't know, however, if it would return "Saint Vicent", "St. Vicnent", or some other mispelling.
Link to comment
Share on other sites

You may want to store the original address as typed into one field, and have a second field where you store search keywords. In the search keywords field you would remove words like "st", "st.", "saint", etc that have different variations and store the rest of it. So even if someone just searched for "vincent" it would return the other results as well. It seems like a lot of systems are starting to do that though - saving the user's info as-entered in one place but actually searching in another place that was designed to hold keywords. The phpbb database is about 90% composed of search information, the users and actual posts together are probably only around 10% of the total database size.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...