Counting Characters with the var_dump() Function


iwato

Holiday Greeting: Happy New Year in the Year of the Rabbit 2011!

Background: Below you will find two blocks of code that are identical in every way except for the quotation marks around the escape sequence \n. Below each block of code you will also find the output obtained from a variable dump and an echo statement associated with that block.

Question: Why do both variable dumps yield the same number of characters when clearly the value of $packed_str is not empty in the first case, but is empty in the second? Is var_dump counting only the NUL characters expressed by the $format parameter of the pack() function? What is going on?

Block One - Single Quotation Marks

$packed_str = pack('a2', '\n');
var_dump($packed_str);
echo '<br />';
echo $packed_str;

Output
string(2) "\n"
\n

Block Two - Double Quotation Marks

$packed_str = pack('a2', "\n");
var_dump($packed_str);
echo '<br />';
echo $packed_str;

Output
string(2) " "

Roddy


The second case is actually a new line and a null character... that's visible with XDebug on... So that's two characters right there.

How come a null character appears as part of a string only in the second case and what exactly is the pack function really doing, I don't know... but why do you keep trying to use this pack() function? Unless perhaps you were trying to construct an SQL-less DB, this function is pointless, especially in today's day and age when 1KB memory is not as much as it once was.

While I don't know for sure how pack() works, the fact that null appears as part of a string may be (for you "another") PHP bug.


The second case is actually a new line and a null character... that's visible with XDebug on... So that's two characters right there.
Thank you for this very useful clarification -- obviously not something that was readily visible to the "naked" eye.

In answer to your question:
... but why do you keep trying to use this pack() function? Unless perhaps you were trying to construct an SQL-less DB, this function is pointless, especially in today's day and age when 1KB memory is not as much as it once was.
Because there are apparently a lot of "idiots" who do not know what to do with their time and mislead novice beginners like me into exploring avenues of digression that maybe I should not. In any case, it was presented as a means to convert from Chinese code written with HTML-ENTITIES into Arabic code written with UCS-2BE.
While I don't know for sure how pack() works, the fact that null appears as part of a string may be (for you "another") PHP bug.
No comment.

Roddy

QUOTE While I don't know for sure how pack() works, the fact that null appears as part of a string may be (for you "another") PHP bug.
No comment.
What does the manual for pack say when you use "a" as the format character?

What does the manual for pack say when you use "a" as the format character?
The formatting character 'a' represents a NUL character.
The formatting string 'a2' represents two NUL characters side-by-side.

Does the var_dump() function count NUL characters when it returns the length of a string?

Roddy

:) :) I forgot to re-read the actual pattern... it now makes sense.
NUL-padded string
Taken out of context, this is vague... but "a2" means "a two-byte, NUL-padded string". For a string to be "padded" means that if it is shorter than a certain length, it is filled up with a certain character until it reaches that length. In this case the result is always exactly two bytes: shorter input is padded with NUL bytes until it is two bytes long, and longer input is cut off after two bytes. That also explains why only the second example has a NUL in it. In double quotes, "\n" is a newline, which is one byte/character, so there is one byte left to fill with NUL. In single quotes, '\n' is not an escape sequence at all but a backslash followed by the letter n, which is already two bytes, so nothing gets padded.
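If you want to see this for yourself without XDebug, here is a minimal sketch. It assumes nothing beyond core PHP; the variable names $single and $double are mine, not from the original post. bin2hex() makes the invisible bytes visible, and strlen() shows that NUL bytes are counted like any other byte.

```php
<?php
// Single quotes: \n is NOT an escape sequence, so this is a backslash plus 'n' (2 bytes).
$single = pack('a2', '\n');

// Double quotes: \n is one newline byte, so pack() adds one NUL byte to reach 2 bytes.
$double = pack('a2', "\n");

echo strlen($single), ' ', bin2hex($single), "\n"; // 2 5c6e
echo strlen($double), ' ', bin2hex($double), "\n"; // 2 0a00

// Input longer than the requested width is cut off, not extended.
echo bin2hex(pack('a2', 'abc')), "\n"; // 6162
```

Both strings are two bytes long, which is why both var_dump() calls report string(2); only the content differs.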
In any case, it was presented as a means to convert from Chinese code written with HTML-ENTITIES into Arabic code written with UCS-2BE.
Huh?

Why would you want to do that? If you store everything as UTF-8, you can write any character you want with no entities or escape sequences. And you can use things like iconv to actually convert non-UTF-8 stuff to UTF-8. Yes, for some more exotic characters, UTF-8 may take more bytes than other encodings, but the robustness and stability your app gets is well worth it.
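As a rough sketch of what I mean, assuming your input really is text with numeric HTML entities and that the iconv extension is available: the sample entity and the variable names below are made up for illustration, and the encoding names iconv accepts can vary with the underlying iconv library. No pack() is needed for any of this.

```php
<?php
// Hypothetical input: the character U+4E2D written as a numeric HTML entity.
$entities = '&#20013;';

// Step 1: decode the entities into real UTF-8 text.
$utf8 = html_entity_decode($entities, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Step 2 (only if some other encoding is really required): convert with iconv().
$ucs2 = iconv('UTF-8', 'UCS-2BE', $utf8);

echo bin2hex($utf8), "\n"; // e4b8ad
echo bin2hex($ucs2), "\n"; // 4e2d
```

If everything downstream accepts UTF-8, you can stop after step 1.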
