
The unpack() Function


iwato


QUESTION: What must one do to display a binary data string in its raw format, namely as a series of bits composed only of zeros and ones?

BACKGROUND: In my quest to understand the md5() function thoroughly, I created a little function and a bit of code to test all the different ways to read a binary string offered by the pack() and unpack() functions. Of the 21 different ways available to unpack a binary data string, I was able to achieve only 18. For some reason my little bit of script failed with data-type 'x'.

Fatal error: Maximum execution time of 30 seconds exceeded in ../md5.php on line 98.
In any case, I was amazed by what I obtained. Each of the following sets of output corresponds to one of the format types available for reading a string of binary data. See the pack() function for a brief summary of the possible data-type formats. Unfortunately, none of these produced that for which I was looking -- namely, the binary code, itself.
a*: 8p'OlIg(
A*: 8p'OlIg(
h*: f18307eb72f4c6943b3ea1c0768259f7
H*: 1f3870be274f6c49b3e31a0c6728957f
c*: 3156112-66397910873-77-29261210340-107127
C*: 3156112190397910873179227261210340149127
s*: 14367-167842026318796-724530981034332661
S*: 143674875220263187965829130981034332661
n*: 79922886210063277214605166682640838271
v*: 143674875220263187965829130981034332661
i*: -109994185712318349192030888192140481639
I*: -109994185712318349192030888192140481639
l*: -109994185712318349192030888192140481639
L*: -109994185712318349192030888192140481639
N*: 523792574659516489-12769622921730712959
V*: -109994185712318349192030888192140481639
f*: -0.234589084983967922.43751.19322491986E-31NAN
d*: 5.05053008237E+453.71438203471E+306
x*:


I'm not sure why you have problems with the x thing... it could be another PHP bug (you, bug hunter!).

The only function I was able to find that would show a string representation of a binary number is decbin(). It takes an integer and returns its binary representation as a string.

I think you can get the binary representation of binary data like this:

echo decbin(hexdec(bin2hex($bin)));

Where $bin is the binary data, as returned by, for example, md5('apple', true). You can omit the bin2hex() call if you don't use the raw output, since md5() then already returns a hex string.
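As a quick check of what that one-liner produces on a short input, here is a minimal sketch. The two-byte sample string is an assumption, chosen so that the value fits comfortably in a PHP integer. Note that decbin() does not keep leading zeros.

```php
<?php
// Two raw bytes, 0x1f and 0x38 (the first two bytes of md5('apple', true)).
$bin = "\x1f\x38";

$hex  = bin2hex($bin);        // "1f38"
$bits = decbin(hexdec($hex)); // binary digits of 0x1f38, leading zeros dropped

echo $bits, "\n";             // prints 1111100111000
```

The output has 13 digits rather than 16, because the three leading zero bits of 0x1f are not shown.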


I'm not sure why you have problems with the x thing... it could be another PHP bug (you, bug hunter!).
Maybe I should change my avatar from a dead dragon-fly to a live one -- the latter being clearly more difficult to obtain. In any case, the dragon-fly was supposed to bring me good luck.

The following clearly fails.
The only function I was able to find that would show a string representation of a binary number is decbin(). It takes an integer and returns its binary representation as a string. I think you can get the binary representation of binary data like this:
echo decbin(hexdec(bin2hex($bin)));

Where $bin is the binary data, as returned by, for example, md5('apple', true). You can omit the bin2hex() call if you don't use the raw output, since md5() then already returns a hex string.


I forgot that like C, PHP uses long integers to represent... well... integers. A 64 bit version of PHP should work I think, but PHP-32bit can only work with up to

decbin(hexdec('fffffffffffff'))

or down to

decbin(hexdec('-fffffffffff'))

(according to my tests...)

So that's up to 13 hex characters... and md5 uses 32.

Hm... Oh well... you'll have to use a custom function, since PHP lacks one. Here's one I just made up (and this time tested):

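// Note: PHP 5.4 and later ship a built-in hex2bin(), so this function must be renamed there.
function hex2bin($hex, $raw = false) {
    $result = '';
    $length = strlen($hex);
    if ($raw) {
        // Raw bytes: turn each byte into two hex digits, then recurse on the hex string.
        for($i = 0; $i < $length; $i++) {
            $result .= str_pad(dechex(ord($hex[$i])), 2, '0', STR_PAD_LEFT);
        }
        return hex2bin($result);
    }elseif(preg_match('/^([0-9]|[A-F])+$/i', $hex)) {
        // Hex string: turn each digit into exactly four bits.
        for($i = 0; $i < $length; $i++) {
            $result .= str_pad(decbin(hexdec($hex[$i])), 4, '0', STR_PAD_LEFT);
        }
        return $result;
    }else {
        return false;
    }
}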

For the raw hash, set the second parameter to true. If you have an actual hex string (i.e. md5() called without its second parameter), leave it as false.
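For comparison, here is a minimal sketch of another way to reach the same bit string from the raw hash, going byte by byte with unpack('C*') and sprintf('%08b'). The helper name bytes_to_bits() is hypothetical and not part of PHP or of the function above.

```php
<?php
// Hypothetical helper: render raw bytes as a string of 0s and 1s.
function bytes_to_bits($raw) {
    $bits = '';
    // 'C*' unpacks every byte as an unsigned char (0 to 255).
    foreach (unpack('C*', $raw) as $byte) {
        // %08b prints the byte in binary, zero-padded to 8 digits.
        $bits .= sprintf('%08b', $byte);
    }
    return $bits;
}

$bits = bytes_to_bits(md5('apple', true));

// Show the 128 bits in groups of eight for readability.
echo chunk_split($bits, 8, ' '), "\n";
```

Because each byte is handled separately, no value ever exceeds 255, so the integer size of the PHP build does not matter here.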


Here's one I just made up (and this time tested):
Firstly, I would like to thank you for your enormous effort. It is certainly beyond anything that I am able to do at this time in this regard. However, before making a sincere effort to understand what you did, I decided to check its validity. Here is what I discovered:

1f:38:70:be:27:4f:6c:49:b3:e3:1a:c:67:28:95:7f - My reversal of your binary string.
1f:38:70:be:27:4f:6c:49:b3:e3:1a:0c:67:28:95:7f - The original md5('apple', 0) hash.

It is essentially a weakness in my reconstruction or a weakness in your construction, and I am not sure which. Notice the 12th pair of hexadecimal characters reading from left to right. Whereas the original string indicates '0c', my reconstruction of your binary string yields only 'c'. This is the code that I used.
$hex = md5('apple', 0);
$bin = hex2bin($hex, 0);
$bin_8bit = substr(chunk_split($bin, 8, ' '), 0, -1);
$bin_8bit_arr = explode(' ', $bin_8bit);
$hex_arr = array();
foreach ($bin_8bit_arr as $hexbin) {
    // dechex() does not zero-pad: 00001100 becomes 'c', not '0c'.
    $hexbin = dechex(bindec($hexbin));
    $hex_arr[] = $hexbin;
}
$hex_str = implode('', $hex_arr);

If you compare $hex_str with $hex, you will discover that they do not match.

Roddy


You can change that to pad the string with zeros so that it's always at least 2 characters. What you're seeing is the difference between a number like 123 and 0123, it's just that it doesn't show leading zeros when it's a number.
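A minimal sketch of that padding fix, written as a self-contained round trip so it can be run on its own. The variable names are illustrative, and str_pad() is just one way to do it (sprintf('%02x', ...) would work as well).

```php
<?php
$hex = md5('apple');

// Hex -> bits: four bits per hex digit, zero-padded on the left.
$bits = '';
foreach (str_split($hex) as $digit) {
    $bits .= str_pad(decbin(hexdec($digit)), 4, '0', STR_PAD_LEFT);
}

// Bits -> hex: one byte (8 bits) at a time, each padded to two hex digits
// so that 00001100 becomes "0c" rather than "c".
$pairs = array();
foreach (str_split($bits, 8) as $byte) {
    $pairs[] = str_pad(dechex(bindec($byte)), 2, '0', STR_PAD_LEFT);
}
$roundTrip = implode('', $pairs);

var_dump($roundTrip === $hex); // bool(true)
```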


You can change that to pad the string with zeros so that it's always at least 2 characters. What you're seeing is the difference between a number like 123 and 0123, it's just that it doesn't show leading zeros when it's a number.
Yeah, but what happens when one changes the nibble order? It appears that one must write separate code for each eventuality?

Roddy

The nibble order (which up until now I knew as "byte significance"...) matters when the binary is supposed to be interpreted as something more specific, as opposed to blindly translated, as these conversions are. It basically dictates whether "1f" will be read as "1f" (31) or "f1" (241).

In a blind translation from one base to another, where both are powers of two, the only thing you need to do is translate each digit into binary and pad it to the number of bits that the highest digit takes. "F" is "1111"... 15... 4 bits... so every digit (0 included) needs to be padded so that it is written as 4 bits. If the desired base were not binary but, say, octal, you'd then have to reparse the binary string (from right to left) with 3 bits for each digit.

Those are the kinds of things they taught us in programming classes :) . But they can also make sense when you read up on Wikipedia.
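The hex-to-binary-to-octal route described above can be sketched roughly as follows. The function name hex_to_octal() is hypothetical, and left-padding the bit string to a multiple of three stands in for grouping from right to left.

```php
<?php
// Hypothetical helper: convert a hex string to octal by way of a bit string.
function hex_to_octal($hex) {
    // Step 1: every hex digit becomes exactly 4 bits.
    $bits = '';
    foreach (str_split($hex) as $digit) {
        $bits .= str_pad(decbin(hexdec($digit)), 4, '0', STR_PAD_LEFT);
    }

    // Step 2: pad on the left to a multiple of 3 bits, which has the same
    // effect as forming the 3-bit groups from right to left.
    $width = (int) (ceil(strlen($bits) / 3) * 3);
    $bits  = str_pad($bits, $width, '0', STR_PAD_LEFT);

    // Step 3: every group of 3 bits becomes one octal digit.
    $octal = '';
    foreach (str_split($bits, 3) as $group) {
        $octal .= bindec($group);
    }

    // Drop leading zeros, but keep a single "0" for a zero value.
    $octal = ltrim($octal, '0');
    return $octal === '' ? '0' : $octal;
}

echo hex_to_octal('1f'), "\n"; // prints 37
```

Since the work is done on strings, digit by digit, the input can be as long as an md5 hash without running into integer limits.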


For the raw hash, specify the second parameter to true. If you have an actual hex string (ala without the second parameter to md5), leave it to false.
OK, I have finally found the time to go through your code and can confidently say that I understand everything except your use of the preg_match() function. According to the online PHP manual, the preg_match() function only finds one match and then quits. If, indeed, you are testing for a complete hexadecimal string, similar to the one that you create in the prior if-statement, why do you not employ the preg_match_all() function? After all, as it is currently written, does your code not test for only the first hexadecimal pair?
function hex2bin($hex, $raw = false) {
    $result = '';
    $length = strlen($hex);
    if ($raw) {
        for($i = 0; $i < $length; $i++) {
            $result .= str_pad(dechex(ord($hex[$i])), 2, '0', STR_PAD_LEFT);
        }
        return hex2bin($result);
    }elseif(preg_match('/^([0-9]|[A-F])+$/i', $hex)) {
        for($i = 0; $i < $length; $i++) {
            $result .= str_pad(decbin(hexdec($hex[$i])), 4, '0', STR_PAD_LEFT);
        }
        return $result;
    }else {
        return false;
    }
}

Roddy


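The "^" and "$" anchors ensure that the string is tested from start to end... Without them, preg_match() would succeed as soon as it found a run of hex digits anywhere in the string, despite the "+". I'm not sure what would happen if you pass a string with newlines though. One variant to try is the m modifier, which makes "^" and "$" match at every line boundary, so a multi-line string passes as long as one of its lines consists only of hex digits: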

}elseif(preg_match('/^([0-9]|[A-F])+$/im', $hex)) {

An alternative approach, using preg_match_all() would involve actually counting the matches, and making sure they're the expected number, like:

}elseif(preg_match_all('/[0-9]|[A-F]/im', $hex) === $length) {
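To see how these variants behave, here is a small sketch that runs them on a few sample strings. The sample inputs and the array keys are assumptions made for illustration. The D modifier shown in one line is a standard PCRE option that stops "$" from matching before a trailing newline.

```php
<?php
$anchored = '/^([0-9]|[A-F])+$/i';

$results = array(
    // Whole string is hex digits: match.
    'valid'      => preg_match($anchored, '1f38'),
    // A stray "g" anywhere makes the anchored pattern fail.
    'invalid'    => preg_match($anchored, '1f3g'),
    // Without anchors, the leading run "1f3" is enough for a match.
    'unanchored' => preg_match('/([0-9]|[A-F])+/i', '1f3g'),
    // By default "$" also matches just before a final newline.
    'trailing'   => preg_match($anchored, "1f38\n"),
    // With D, "$" matches only at the very end of the string.
    'with_D'     => preg_match('/^([0-9]|[A-F])+$/iD', "1f38\n"),
    // With m, the first line alone satisfies the pattern.
    'with_m'     => preg_match('/^([0-9]|[A-F])+$/im', "1f38\nzz"),
    // preg_match_all() returns the number of matching characters.
    'count'      => preg_match_all('/[0-9]|[A-F]/i', '1f3g', $matches),
);

print_r($results);
```

In the last entry the count is 3 while strlen('1f3g') is 4, which is how the counting approach detects the invalid character.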


The "^" and "$" modifiers ensure that the string is tested from start to end.
Got it! Finally, a practical example of the ^ and $ anchors that I have understood. No fault of your own for my reluctant comprehension, mind you.

Thanks.

Roddy

Archived

This topic is now archived and is closed to further replies.
