Jump to content

Big data comparison problem... .TXT -> ASP -> SQL


Lee-yoshi

Recommended Posts

Hello!I have a big problem, and after much thought i cannot come up with a pratical solution, so i figured i'd see what you guys thought?I have a MySQL table that could possibly exceed 1million records one day! (It's pretty big!) I also have a text file which contains many, many thousands of records.The problem is that i need to compare one record from the text file to find whether or not the record already exists in the MySQL table or not, and if it does update the sql record, else insert it.Obviously since there are so many comparisons that need to take place, it can't all be done in one load of an Internet Explorer page, and i've tried spreading it out amongst many IE page loads (Checking only 500 at a time or something) but it's either taking way too long (Upto a day!) or it's breaking on me because it's checking so much in one go!!!So my question is, without me having to build a program, is it possible to do this either online (Or offline) in ASP? Is there a much simpler solution?Any thoughts & feedback/comments anyone might have would be appreciated :)

Link to comment
Share on other sites

If using ASP is a requirement, then you should be able to set the timeout for how long you want the page to run, like a minute, and maybe schedule it to run every minute. So it will basically be running constantly as long as there are records to process. You can schedule a task to open a browser to your page and disable caching on it to ensure that it always runs the code, and have it close the window after a certain amount of time. The down side is that you'll basically have browser windows constantly opening and closing.

Link to comment
Share on other sites

Yea i've already had a play with the server.timeout settings, but the problem is it only gets through about 100 record comparisons from .txt file to MySQL database, and takes many many hours to get through the entire .txt file through page refreshing :)The only real reason ASP is a requirement is because it's all i really know! Is there a better way of doing it?Or any other work arounds?Is there a way you can remove lines from a .txt file without having to rewrite the entire thing? (Thus making the file smaller and smaller to check through making it quicker every cycle) I can only find functions that skip lines and write lines.

Link to comment
Share on other sites

ASP isn't exactly the best tool for this job, it's not very fast. Any compiled language would be faster. It's possible that part of the problem is the actual code, you might not be doing things as efficiently as you could. It would be faster, for example, to group SQL queries into batches of several hundred and send them all at once instead of sending queries individually. The MySQL ASP driver might not support multiple queries at once though. It would also be faster, although much more memory-intensive, to get all of the lookup data at once and put it in a random-access data structure like an array than it would be to send a SQL query to look up whether each individual item exists. The goal is to minimize the number of times you need to go to the database. If you're using VBScript, then for a random-access data structure you're limited to a dictionary object, and those are horribly, horribly slow. If you're using Javascript then you could store it all in an array or generic object, which would be much faster to look up than using an ASP dictionary object. But still, any compiled language would blow ASP out of the water. If it's only processing 100 items in a minute then it sounds like you're sending too many queries.

Is there a way you can remove lines from a .txt file without having to rewrite the entire thing?
No, but you can read lines without needing to read the entire file. The way you remove lines is to get the text of the file, remove from that whatever you want to, and write it back.
Link to comment
Share on other sites

So you're saying if i try to program this comparison search in Javascript (From what i gather is very similiar to C++) instead of ASP, it'll perform quicker?I have also thought about storing the data (Or even simple data such as a '0' or '1' to state whether to read or skip each record depending on whether it's found matches or not) but the problem again there is that this .txt file contain over 250,000 records, hence my horrendous problem!! The size of the array would mean it's too big to store in a cookie to save and work on between page loads. Is it possible to send an array in javascript from one page to another?And yes remvoign individual lines from a .txt file by just writing the rest of them is the only way i could think up aswell, but again it's a timely process when the .txt file is so large :)

Link to comment
Share on other sites

You can use Javascript instead of VBScript to write ASP. ASP isn't an actual language, it's more of a tool kit. Technically it's a framework. Using Javascript gives you access to more powerful data structures than VBScript supports, and Javascript is generally faster overall. You may want to consider ASP.NET, which will get compiled to an executable before the server runs it. You can write .NET code with several additional languages, like C#.For moving data structures between web pages, it's most common to use the session. You can also serialize the data structure so that it's a string, write the string to a file or database, and then you can read the string and unserialize it back into the original data structure. The JSON format is one way to do that easily in Javascript. If you just store an array in the session the server will take care of serializing that.

Link to comment
Share on other sites

  • 3 weeks later...

So if i use ASP.NET, surely it'll still have to search through the database everytime for each record, unless you mean i should get all the data from the database and put it into a big array of some kind and then compare against that?Surely it'll always be a long process.I'm struggling so much with this it's shocking :)

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...