justsomeguy
Posted March 8, 2013

I'm still doing research, but I thought I would ask here to see if anyone has run into this before. I'm trying to parse a CSV file that is saved as UTF-8 with BOM. This is the code I'm testing:

    <?php
    header('Content-Type: text/html; charset=UTF-8');
    ini_set('display_errors', 1);
    error_reporting(E_ALL);

    $fname = 'French_Language_Upload_File_full_utf.csv';
    $contents = file_get_contents($fname);

    echo '<pre>' . print_r(mb_detect_order(), true) . '</pre><br>';
    echo 'encoding: ' . mb_detect_encoding($contents) . '<br>';
    echo '<pre>' . $contents . '</pre><br>';

    $csv = new SplFileObject($fname);
    $csv->setFlags(SplFileObject::READ_CSV | SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY);

    echo '<pre>';
    foreach ($csv as $row)
    {
        print_r($row);
    }
    echo '</pre>';

What I'm seeing is that some fields that start with certain characters are having those characters stripped. One of the lines in the CSV file contains this, for example:

    common,inactive_students,Inactive students,Inactive students,Étudiants inactifs

After CSV parsing, it looks like this:

    Array
    (
        [0] => common
        [1] => inactive_students
        [2] => Inactive students
        [3] => Inactive students
        [4] => tudiants inactifs
    )

This only happens when a field starts with one of these characters; if that same accented capital E appears in the middle of the text, it does not get lost. It also doesn't strip every accented leading character. This one is correct:

    Array
    (
        [0] => common
        [1] => ug_admin_news_permission
        [2] => At this time User Group Admins are only able to add news items to their own user groups.
        [3] => At this time User Group Admins are only able to add news items to their own user groups.
        [4] => À l’heure actuelle, les administrateurs peuvent seulement ajouter des nouveaux items à leurs propres groupes d’utilisateurs.
    )

This line, though:

    common,r_u_sure_add,Are you sure you want to add the selected users?,Are you sure you want to add the selected users?,Êtes-vous sûr(e) de vouloir ajouter les utilisateurs sélectionnés ?

Ends up like this:

    Array
    (
        [0] => common
        [1] => r_u_sure_add
        [2] => Are you sure you want to add the selected users?
        [3] => Are you sure you want to add the selected users?
        [4] => tes-vous sûr(e) de vouloir ajouter les utilisateurs sélectionnés ?
    )

This also happens with fgetcsv. Has anyone run into this before? Any ideas? The server is running PHP 5.2.17.
justsomeguy (Author)
Posted March 8, 2013

This problem does not appear if I wrap all strings in quotes, so now I have to figure out how to force Excel to put quotes around everything.
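One thing that may be worth checking before fighting with Excel: the PHP manual notes that fgetcsv takes the locale settings into account, and a non-UTF-8 LC_CTYPE is a commonly reported cause of leading multibyte characters being dropped from unquoted fields. That would also be consistent with the row that parsed correctly, since its last field contains a comma and Excel would normally have quoted it. The sketch below is only an illustration under those assumptions, not a confirmed fix for PHP 5.2.17: `read_utf8_csv` is a hypothetical helper name, and the locale names are guesses that must actually be installed on the server for `setlocale` to accept them.

```php
<?php
// Hypothetical helper (not from the original post): read a UTF-8 CSV file
// after trying to select a UTF-8 locale and skipping a leading BOM.
function read_utf8_csv($fname)
{
    // Remember the current LC_CTYPE so it can be restored afterwards.
    $previous = setlocale(LC_CTYPE, '0');

    // Assumed locale names; setlocale() uses the first one that is installed
    // and returns false (leaving the locale unchanged) if none is available.
    setlocale(LC_CTYPE, array('en_US.UTF-8', 'fr_FR.UTF-8', 'C.UTF-8'));

    $rows = array();
    $fh = fopen($fname, 'rb');
    if ($fh === false) {
        setlocale(LC_CTYPE, $previous);
        return $rows;
    }

    // Skip the three-byte UTF-8 BOM if present; otherwise start over.
    if (fread($fh, 3) !== "\xEF\xBB\xBF") {
        rewind($fh);
    }

    while (($row = fgetcsv($fh, 0, ',')) !== false) {
        // fgetcsv() returns array(null) for a blank line; skip those.
        if ($row === array(null)) {
            continue;
        }
        $rows[] = $row;
    }

    fclose($fh);
    setlocale(LC_CTYPE, $previous);

    return $rows;
}
```

Calling `read_utf8_csv('French_Language_Upload_File_full_utf.csv')` and passing the result to `print_r` would show whether the leading É and Ê survive on a given server. If `setlocale` cannot find any of the listed locales, the behavior will be the same as before, so checking its return value is a reasonable first diagnostic.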