Gilbert Posted November 16, 2018

Hi all,

I upload a text file and extract info from it to put into my database on GoDaddy. When I run my PHP code on it, it tells me that it can't read the file because it is in ansi-xxxx format. In my PHP code I'm using $var = fgets() to read each line and then put the $vars into the correct table of the database.

So I clicked the button at the top of the code editor and converted the text file to UTF-8, but the conversion leaves the file with 2 odd characters at the beginning and puts a blank line between each line. When I delete the 2 characters and the blank lines and run my code, everything works as it should and updates my tables.

My question is: can I convert the ANSI text file using PHP, without any manual manipulation? I've read about utf8_encode(), but it says it encodes ISO-8859-1, and my file is ANSI. Or is ISO-8859-1 an umbrella for a lot of different encodings? Can I convert the whole file at once, or do I set up a loop to read a line, convert it and write it to a new file? Am I interpreting this correctly?

I'd appreciate a code snippet so I can see how to set it up, or a reference to more reading so I can learn. Thank you very much!

dsonesuk Posted November 16, 2018

Save the file encoded as UTF-8, problem gone! The option is in the 'Save As' dialog: it has File name:, Save as type:, and under those Encoding:, which defaults to ANSI.

justsomeguy Posted November 16, 2018

"the conversion leaves the file with 2 odd characters at the beginning"

That's the UTF-8 BOM. You can use this to strip the BOM from the beginning of a string:

    // Check for a variety of byte order marks at the beginning of the
    // string and remove the first one that matches.
    function strip_bom($str) {
      // The UTF-32 marks are listed before the UTF-16 ones because the
      // UTF-32 (LE) mark starts with the same two bytes as the UTF-16 (LE) mark.
      $boms = [
        pack('CCC', 0xef, 0xbb, 0xbf),        # UTF-8
        pack('CCCC', 0x0, 0x0, 0xfe, 0xff),   # UTF-32 (BE)
        pack('CCCC', 0xff, 0xfe, 0x0, 0x0),   # UTF-32 (LE)
        pack('CC', 0xfe, 0xff),               # UTF-16 (BE)
        pack('CC', 0xff, 0xfe),               # UTF-16 (LE)
      ];
      foreach ($boms as $b) {
        if (substr($str, 0, strlen($b)) === $b) {
          return substr($str, strlen($b));
        }
      }
      return $str;
    }

You can also check if the line is empty and skip it if so. Only the first line of a file can start with a BOM, but calling strip_bom() on every line is harmless:

    while (($line = fgets($handle, 4096)) !== false) {
      $line = trim(strip_bom($line));
      if ($line === '') {
        continue;
      }
      // process $line
    }

You can also convert a character encoding:

http://php.net/manual/en/function.mb-convert-encoding.php
http://php.net/manual/en/function.iconv.php

If your database is set up to store UTF-8 data, make sure you insert data using the correct encoding. Obviously, if the file already uses the correct encoding then you don't need to do anything special.
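
For the "read a line, convert it, write to a new file" loop asked about in the first post, here is a minimal sketch using iconv(). It assumes that "ANSI" means Windows-1252, which is the usual case on Western-locale Windows but is not confirmed anywhere in this thread. The function name convert_file_to_utf8() and its parameters are made up for illustration. utf8_encode() is not used because it treats its input as ISO-8859-1, which differs from Windows-1252 for bytes 0x80 to 0x9F (curly quotes, the euro sign and so on).

```php
<?php
// Sketch: convert a text file to UTF-8 one line at a time.
// Assumption: the source encoding is Windows-1252 ("ANSI"); pass a
// different $from if your file uses another code page.
// Returns the number of lines written.
function convert_file_to_utf8(string $source, string $target, string $from = 'Windows-1252'): int
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $source for reading");
    }
    $out = fopen($target, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $target for writing");
    }

    $lines = 0;
    while (($line = fgets($in)) !== false) {
        // iconv() returns false if the line contains bytes that are
        // not valid in the source encoding.
        $converted = iconv($from, 'UTF-8', $line);
        if ($converted === false) {
            fclose($in);
            fclose($out);
            throw new RuntimeException('Conversion failed on line ' . ($lines + 1));
        }
        fwrite($out, $converted);
        $lines++;
    }

    fclose($in);
    fclose($out);
    return $lines;
}
```

A call such as convert_file_to_utf8('upload.txt', 'upload-utf8.txt') (both file names hypothetical) writes a UTF-8 copy without a BOM and keeps the original line endings, so the converted file can be read with the same fgets() loop as before. For a small file, reading everything with file_get_contents() and converting it in one iconv() call would work as well; the line-by-line version just keeps memory use flat.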