
IanArcher

Members
  • Posts

    115

Posts posted by IanArcher

  1. You're not printing $image[$cid], you're printing $filename. I would need to see all of your code though, it looks like you've still got 2 arrays (an array of CIDs, and an array of filenames) when you should only have 1 that contains both.
    Here's a copy of all of my code: http://pastebin.com/bRARHt0u
  2. The only thing you're telling it to print is the filename. If you want to print the entire array then use $image instead of $filename.
    I know, but what I'm trying to understand is why the combination of $image and [$cid] prints the filename of the second image. Even if I do not define these variables as their img tag source matches and CID source matches, it appears anyway. Why is that?
  3. That is the example. Instead of your numerically indexed array starting at 0, use the CIDs as the indexes.
    (
        ["813DEC0642E941F0845447B680DA566A"] => Image1.gif,
        ["6829CB8F2D5D43AAB115DE06572E770A"] => Image2.gif
    )

    I did this:
        $srcMatch[2][0] = $image;
        $cidMatch[1][0] = $cid;
        $image[$cid] = $filename;
        print_r($filename);

    and it prints

    Image2.gif

    For learning purposes, how come it prints the filename of the image, though? I also changed the indexes to 1, for it to look for Image1[2] instead:

        $srcMatch[2][1] = $image;
        $cidMatch[1][1] = $cid;

    But it still prints Image2[2].gif

  4. Don't get all of the headers and write them to a text file. Get each set of headers individually, process that one set (get the name and CID from it), and add that to the array: $images[$cid] = $filename; Then get the next set of headers and process those. The end result should be an array that contains a mapping between CID and filename values, not a text file that contains all of the headers from every section in the email.
    I don't really follow that last example you gave:
    $images[$cid] = $filename;

    Could you give me an example/sample please?
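The suggestion quoted at the top of this post could look roughly like the sketch below. It assumes the headers of each MIME part are available as a separate string (hard-coded here from the sample headers in this thread; in the real script they would presumably come from one imap_fetchmime() call per part). The helper name buildCidMap is hypothetical, and the regular expressions only cover the header shapes shown in this thread.

```php
<?php
// Hypothetical helper: build a CID => filename map, one header block at a time.
function buildCidMap(array $headerBlocks): array
{
    $images = [];
    foreach ($headerBlocks as $headers) {
        // Filename from the name="..." parameter of the Content-Type header.
        $hasName = preg_match('/name="([^"]+)"/i', $headers, $nameMatch);
        // CID from the Content-ID header, without the angle brackets.
        $hasCid = preg_match('/^Content-ID:\s*<([^>]+)>/mi', $headers, $cidMatch);
        if ($hasName && $hasCid) {
            $images[$cidMatch[1]] = $nameMatch[1];
        }
    }
    return $images;
}

// Sample header blocks, one string per MIME part (assumed input shape).
$parts = [
    "Content-Type: image/gif; name=\"Image1[2].gif\"\r\n"
        . "Content-Transfer-Encoding: base64\r\n"
        . "Content-ID: <813DEC0642E941F0845447B680DA566A@BrinardHP>\r\n",
    "Content-Type: image/gif; name=\"Image2[2].gif\"\r\n"
        . "Content-Transfer-Encoding: base64\r\n"
        . "Content-ID: <6829CB8F2D5D43AAB115DE06572E770A@BrinardHP>\r\n",
];

$images = buildCidMap($parts);
print_r($images);
```

Because each block is processed on its own, a name can only ever be paired with the Content-ID from the same part, which is the point of not concatenating all headers into one file first.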

  5. The way I see it, you need this information: From the headers, you need the content-type and content-id headers. The content-type header contains the filename that you can extract, and the content-id header contains the CID used in the email. You need to process those to get a mapping between the CID and filename, so that you can lookup a particular CID to find out what the original filename was. If you store those in an associative array where the CID is the key and the filename is the value, that will make that part easy. Once you have that, you can get all of the img tags from the email and loop through them. For each img tag, you need to check the src to see if it starts with "cid:". If it does, you extract the CID from the src and look up the original filename in the array described above. Then you replace the value of the src attribute to point to the file instead of the CID. I'm not sure what you're asking about with regard to creating a scope. Do you want to create a namespace to store variables in?
    When I say scope I mean, for example, in mimeheaders.txt:

    Content-Type: multipart/alternative;
        boundary="----=_NextPart_001_0015_01CCD6A7.473AA550"

    Content-Type: image/gif; name="Image1[2].gif"
    Content-Transfer-Encoding: base64
    Content-ID: <813DEC0642E941F0845447B680DA566A@BrinardHP>

    Content-Type: image/gif; name="Image2[2].gif"
    Content-Transfer-Encoding: base64
    Content-ID: <6829CB8F2D5D43AAB115DE06572E770A@BrinardHP>

    how do I distinguish which name= attribute belongs to which Content-ID? That raises the matter of creating a scope of a certain range, so the script knows where to look. PHP says: "Oh, so you have this CID match, but there are multiple images in the entire email's MIME headers? So let's just shrink the range to the correct placement, and scope the proper name element match." So the above example for mimeheaders.txt becomes:

    Content-Type: image/gif; name="Image2[2].gif"
    Content-Transfer-Encoding: base64
    Content-ID: <6829CB8F2D5D43AAB115DE06572E770A@BrinardHP>

    PHP says: "So we noticed this CID match is on the same line as the Content-ID:, and on lines not too far from it there is a name attribute called "Image2[2].gif"; this must be the name match you're looking for." This is really what I'm getting at, you know?
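The lookup-and-replace step described in the quoted reply at the top of this post can be sketched as follows. It assumes a CID => filename map already exists and that the saved files live in an images/ folder (the folder name is taken from a suggestion elsewhere in this thread; the sample HTML is made up).

```php
<?php
// Assumed CID => filename map, built from the per-part MIME headers.
$images = [
    '813DEC0642E941F0845447B680DA566A@BrinardHP' => 'Image1[2].gif',
    '6829CB8F2D5D43AAB115DE06572E770A@BrinardHP' => 'Image2[2].gif',
];

$html = '<p><img alt="a" src="cid:6829CB8F2D5D43AAB115DE06572E770A@BrinardHP">'
      . '<img src="http://example.com/logo.png"></p>';

// Rewrite only src values that start with "cid:" and whose CID is in the map.
$html = preg_replace_callback(
    '/(<img\b[^>]*?\bsrc=")cid:([^"]+)(")/i',
    function (array $m) use ($images): string {
        if (!isset($images[$m[2]])) {
            return $m[0]; // unknown CID: leave the tag untouched
        }
        return $m[1] . 'images/' . $images[$m[2]] . $m[3];
    },
    $html
);

echo $html, "\n";
```

A callback is used instead of a plain preg_replace because the replacement depends on an array lookup for each individual match. Tags whose src is an ordinary URL or filename are not touched.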

  6. This is how it appears: the first array is the matches from the <img> tags and the second is the matches from the name elements of the MIME headers. And there's still that issue of creating a scope.

    Matched image tag sources
    Array
    (
        [0] => cid:813DEC0642E941F0845447B680DA566A@BrinardHP
        [1] => cid:6829CB8F2D5D43AAB115DE06572E770A@BrinardHP
        [2] =>
        [3] =>
        [4] =>
        [5] =>
        [6] =>
        [7] =>
        [8] =>
        [9] =>
        [10] =>
        [11] =>
        [12] =>
    )

    Matches found
    Array
    (
        [0] => Image1[2].gif
        [1] => Image2[2].gif
        [2] =>
        [3] =>
        [4] =>
        [5] =>
        [6] =>
        [7] =>
        [8] =>
        [9] =>
        [10] =>
        [11] =>
        [12] =>
    )

  7. I followed your advice by seeing how you link up the attachments with the HTML body, and in the first for loop in the script you provided I included this:

        $fp = fopen('mimeheaders.txt', 'w');
        $mimepart = imap_fetchmime($mbox, $no, 1);     //1st part
        $mimepart .= imap_fetchmime($mbox, $no, 2);    //2nd part
        $mimepart .= imap_fetchmime($mbox, $no, 3);    //3rd part
        $mimepart .= imap_fetchmime($mbox, $no, 4);    //4th part
        $mimepart .= imap_fetchmime($mbox, $no, 5);    //5th part
        $mimepart .= imap_fetchmime($mbox, $no, 6);    //6th part
        $mimepart .= imap_fetchmime($mbox, $no, 7);    //7th part
        $mimepart .= imap_fetchmime($mbox, $no, 8);    //8th part
        $mimepart .= imap_fetchmime($mbox, $no, 9);    //9th part
        $mimepart .= imap_fetchmime($mbox, $no, 10);   //10th part
        $mimepart .= imap_fetchmime($mbox, $no, 11);   //11th part
        $mimepart .= imap_fetchmime($mbox, $no, 12);   //12th part
        fwrite($fp, $mimepart);
        fclose($fp);
        $rawmime = file_get_contents('mimeheaders.txt');

    So in the mimeheaders.txt file, we have:

        Content-Type: multipart/alternative;
            boundary="----=_NextPart_001_0015_01CCD6A7.473AA550"

        Content-Type: image/gif; name="Image1[2].gif"
        Content-Transfer-Encoding: base64
        Content-ID: <813DEC0642E941F0845447B680DA566A@BrinardHP>

        Content-Type: image/gif; name="Image2[2].gif"
        Content-Transfer-Encoding: base64
        Content-ID: <6829CB8F2D5D43AAB115DE06572E770A@BrinardHP>

    Now I wrote this to match all <img> tags in the email body:

    if (preg_match_all('/<img\s+([^>]*)src="(([^"]*))"\s+([^>]*)(.*?)>/i', $html_part, $srcMatch)) {
        echo 'Matched image tag sources';
        echo '<pre>';
        $srcArr = array(
            $srcMatch[2][0],        //list up to 12 image matches
            $srcMatch[2][1],
            $srcMatch[2][2],
            $srcMatch[2][3],
            $srcMatch[2][4],
            $srcMatch[2][5],
            $srcMatch[2][6],
            $srcMatch[2][7],
            $srcMatch[2][8],
            $srcMatch[2][9],
            $srcMatch[2][10],
            $srcMatch[2][11],
            $srcMatch[2][12],
        );
        print_r($srcArr);
        echo '</pre>';
    } else {
        echo 'No matches found for image tag';
    }

    I wrote this to match the filename/name:

    //Here we are setting up real filename matches
    // var $rawmime (See imapScript.php)
    if (preg_match_all('/Content-Type:(.*);\s+name="(.*)"/', $rawmime, $nameAttr)) {
        echo 'Matches found';
        echo '<pre>';
        $nameArr = array(
            $nameAttr[2][0],    //number of array items equiv. to source
            $nameAttr[2][1],
            $nameAttr[2][2],
            $nameAttr[2][3],
            $nameAttr[2][4],
            $nameAttr[2][5],
            $nameAttr[2][6],
            $nameAttr[2][7],
            $nameAttr[2][8],
            $nameAttr[2][9],
            $nameAttr[2][10],
            $nameAttr[2][11],
            $nameAttr[2][12],
        );
        print_r($nameArr);
        echo '</pre>';
    } else {
        echo 'No match, Check your regular expression';
    }

    I tried something like:

    if ($srcMatch[2] == $nameAttr[2]) {
        preg_replace($srcMatch, $nameAttr, $html_part);
    }

    But I receive parameter errors about a string being expected and an array given. So I would use an implode function to convert it to a string, but even then, preg_replace expects delimiters. A bigger problem is that I want to create a scope so that the script knows where to look when looking for filename/name elements, instead of matching all of them at once. How can I do both of these? Or is there something I'm doing completely wrong? I just want to replace the name match with the correct img tag source match.

  8. That loop is going to replace the contents of every "src" attribute with every image in the array. When it finishes, all "src" attributes will point to the last filename that was in the array, regardless of what they started as. Like I said above, you'll need to do some testing. Save the original HTML email, get the list of information for all of the attachments, and examine the HTML and the attachment information to see how you link up the attachments. I would assume that the CID is given in the information for the attachments, but you need to verify that.
    No, a CID won't always be the case; sometimes those <img> tags may actually carry the original filename of the picture. But I'll try analyzing the attachments against the HTML.
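The behaviour described in the quoted reply at the top of this post can be reproduced with a small sketch. The sample HTML and filenames are made up, and the loop is written so that the result of preg_replace is actually assigned back, which is an assumption about what the loop was meant to do.

```php
<?php
$html = '<img src="cid:aaa@host"> <img src="cid:bbb@host">';
$files = ['one.gif', 'two.gif'];

foreach ($files as $file) {
    // Every pass rewrites *all* src attributes, including ones already rewritten.
    $html = preg_replace('/src="[^"]*"/', 'src="' . $file . '"', $html);
}

echo $html, "\n"; // both tags now point at the last filename
```

The pattern has no way of telling which src belongs to which file, so the last file in the array wins everywhere. That is why a per-image key such as the CID is needed.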
  9. You would use either a for loop or a foreach loop to go through the array of attachments. http://www.php.net/m...res.foreach.php
    Ok, so for example, the script lists all images in the directory as an array. Now I have:

    foreach ($imgmatch as $image) {
        preg_replace('/src="(.*)"/', '/src="$image"/', $html_part);
    }

    What concerns me is when i have images that are shrouded in <img> tags like this:

    src=3D"cid:813DEC0642E941F0845447B680DA566A@BrinardHP"
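The =3D in that src looks like quoted-printable encoding, where =3D stands for a literal equals sign and a trailing = marks a soft line break. Assuming the HTML part really is quoted-printable encoded (the script would have to confirm this from the part's transfer encoding), a sketch of decoding it before running any regex:

```php
<?php
// A fragment as it appears in the raw, still-encoded HTML part.
$raw = 'src=3D"cid:813DEC0642E941F0845447B680DA566A@BrinardHP"';
$decoded = quoted_printable_decode($raw);
echo $decoded, "\n";

// A soft line break ("=" at the end of a line) is removed by decoding.
$wrapped = "cloudsprite.=\r\njpg";
$joined = quoted_printable_decode($wrapped);
echo $joined, "\n";
```

After decoding, the attribute reads src="cid:..." and the same pattern used for other clients applies.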

  10. The full reference for that is actually:

    <img style=3D"width:213px=3Bheight:213=px=3Bborder:0=3B" src="https://******.storage.live.com/y1pZinLnQnaoBClW=RMsQ5sAzEF1H4HGond-KoaAjdcEX0GCp9HzrWa2RSJwI9ngxR7WIub5M9Ps810/cloudsprite.=jpg">';

    Is there a function in the manual that could help me loop through? I've looked for a while and haven't come across any.

  11. I think it's time to follow your advice and create a 'universal' image replacement, because I have completed and tested the handling of images for Microsoft Outlook:

    /***************************** 1st image in email **********************************/
    if (preg_match('/cid:([^"@]*).(png|jpg|jpeg|gif|bmp)@([^"]*)/', $html_part, $m)) {
        $find = '/cid:'.$m[1].'.'.$m[2].'@([^"]*)/';
        if ($m[2] == 'png') $replace = $png1;
        if ($m[2] == 'jpg') $replace = $jpg1;
        if ($m[2] == 'gif') $replace = $gif1;
        if ($m[2] == 'bmp') $replace = $bmp1;
        if ($m[2] == 'jpeg') $replace = $jpeg1;
        $html_part = preg_replace($find, $replace, $html_part);
    }

    /***************************** 2nd image in email **********************************/
    if (preg_match('/cid:([^"@]*).(png|jpg|jpeg|gif|bmp)@([^"]*)/', $html_part, $m)) {
        $find = '/cid:'.$m[1].'.'.$m[2].'@([^"]*)/';
        if ($m[2] == 'png') $replace = $png2;
        if ($m[2] == 'jpg') $replace = $jpg2;
        if ($m[2] == 'gif') $replace = $gif2;
        if ($m[2] == 'bmp') $replace = $bmp2;
        if ($m[2] == 'jpeg') $replace = $jpeg2;
        $html_part = preg_replace($find, $replace, $html_part);
    }

    But when it comes to the other email clients I've tested:

    Windows Live Mail:

    src=3D"cid:813DEC0642E941F0845447B680DA566A@BrinardHP"

    Hotmail:

    src="https://*****.storage.live.com/y1pZinLnQnaoBClW=RMsQ5sAzEF1H4HGond-KoaAjdcEX0GCp9HzrWa2RSJwI9ngxR7WIub5M9Ps810/cloudsprite.=jpg"

    iPhone's default mail app:

    src="cid:1D8E2297-F260-4555-8223-304A9DB08CC5/image.jpeg"

    So instead of possibly spending days coding an image handler based on a specific email client's cid: references, your idea of looping over all the images in a directory is more feasible, but I truly have no idea where to start with that. Any ideas?

  12. That pattern says to match anything that is not a double quote or @ character. When the character class starts with ^, it means match anything except what is listed. So it will stop matching once it finds a double quote or @ sign.
    Oh, thanks for letting me know. Is there a regex manual equivalent to the PHP manual? I'm only aware of regular-expressions.info.
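A small sketch of the point made in the quoted reply at the top of this post, using a made-up message: the negated class [^"@] happily matches newlines, but the capture ends at the first double quote or @ sign.

```php
<?php
$text = "Message: Hello world,\ncontact me at someone@example.com please";

preg_match('/Message: ([^"@]*)/', $text, $m);

// The capture spans the line break but stops right before the "@".
var_dump($m[1]);
```

In an HTML body the first double quote usually arrives very early (for example in a style="..." attribute), which would explain a capture that ends after a single paragraph.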
  13. I must ask, for learning purposes: why does this snippet of code, even with the 's' modifier, not get multiple lines?

    $content = preg_match('/Message: ([^"@]*)/s', $html_part, $matches) ? $matches[1] : '';

    But the code below does:

    $content = preg_match('/Message: (.*)/s', $html_part, $matches) ? $matches[1] : '';

    The regexes in the brackets are different, but what characters or other entities does [^"@]* exclude?
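The difference between the two patterns can be seen in a short sketch with a made-up input. The s modifier only changes what the dot matches; it has no effect on a character class such as [^"@].

```php
<?php
$text = "Message: first line\nsecond line with \"quotes\"";

preg_match('/Message: (.*)/', $text, $a);       // "." stops at the newline
preg_match('/Message: (.*)/s', $text, $b);      // with s, "." also matches newlines
preg_match('/Message: ([^"@]*)/s', $text, $c);  // s is irrelevant; stops at the first quote

var_dump($a[1], $b[1], $c[1]);
```

So the [^"@]* version does cross line breaks; it simply ends at the first double quote or @ it meets, with or without the s modifier.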

  14. I used

    mysql_real_escape_string($content, $db);

    before the SQL insert and I still receive the same error. Now here I changed $html_part to $text_part, and the insert still has a problem with any kind of quotes.

    $content = preg_match('/Message: ([^"@]*)/s', $text_part, $matches) ? $matches[1] : '';

    Result:

    ERROR: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 't work with humans as they can adapt and enter whatever is required, including ' at line 1
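One detail about the call shown above: mysql_real_escape_string() returns the escaped string and does not modify its argument, so calling it without using the return value leaves $content unchanged. The mysql_* extension was also removed in PHP 7.0. A minimal sketch of an alternative is a prepared statement; here PDO with an in-memory SQLite database stands in for the MySQL connection so the example runs without a server, and the table and column names are made up for illustration.

```php
<?php
// Assumption: SQLite in memory replaces the real MySQL database for this sketch.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE news (caption TEXT, content TEXT)');

// Values containing the kinds of quotes that broke the string-built query.
$subject = "It doesn't break";
$content = "<p style='font-size:12.0pt'>Lorem ipsum</p>";

// Placeholders keep the values out of the SQL text, so quotes need no escaping.
$stmt = $db->prepare('INSERT INTO news (caption, content) VALUES (?, ?)');
$stmt->execute([$subject, $content]);

$row = $db->query('SELECT caption, content FROM news')->fetch(PDO::FETCH_ASSOC);
print_r($row);
```

With a MySQL server the same prepare/execute calls apply; only the connection string passed to the PDO constructor would differ.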
  15. In the code I posted in post 12, the $raw_headers variable contains all of the headers from the message. It uses the imap_fetchheader function. The $original_message variable contains the entire unprocessed body of the email. The original email as received by the mail server is the headers, then an empty line, then the body.
    Ok thanks, I got it done. Also, this is a snippet of code from my script. Could you possibly tell me why, whenever I run it to insert into the table, I get this error with a test email containing just text and paragraphs:
    ERROR: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'font-size:12.0pt'>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse lib' at line 1
    I looked at the source formatting and saw that the style for this paragraph is shown as style='font-size:12.0pt'> instead of with double-quotes. That should be the issue right?
    function newsupdate ($text_part, $html_part){
        //Login to MySQL Database
        $hostname = "localhost";
        $db_user = "user";
        $db_password = "password";
        $database = "database";
        $db_table = "table";
        $db = mysql_connect($hostname, $db_user, $db_password);
        mysql_select_db($database,$db);

        //These functions search the open e-mail for the prefix defining strings.
        //Need a function to search after the space after the strings because the subject, categories, snippet, tags and message are constant-changing.
        $subject = preg_match('/Title: (.*)/', $text_part, $matches) ? $matches[1] : '';
        $subject = str_replace("Title: ", "" ,$subject);
        $categories = preg_match('/Categories: (.*)/', $text_part, $matches) ? $matches[1] : '';
        $categories = str_replace("Categories: ", "" ,$categories);
        $tags = preg_match('/Tags: (.*)/', $text_part, $matches) ? $matches[1] : '';
        $tags = str_replace("Tags: ", "" ,$tags);
        $snippet = preg_match('/Snippet: (.*)/', $text_part, $matches) ? $matches[1] : '';
        $snippet = str_replace("Snippet: ", "" ,$snippet);    // removes the snippet prefix
        $content = preg_match('/Message: ([^"@]*)/s', $html_part, $matches) ? $matches[1] : '';
        $content = str_replace("Message: ", "" ,$content);
        $when = strtotime("now");
        $uri = strtolower(str_replace(" ", "-", $subject));
        $uri = substr($uri, 0, 20);

        /*** THIS CODE WILL TELL MYSQL TO INSERT THE DATA FROM THE EMAIL INTO YOUR MYSQL TABLE ***/
        $sql = "INSERT INTO $db_table(`caption`,`snippet`,`content`,`when`,`uri`,`tags`,`categories`,`DATE`) values ('$subject','$snippet','$content','$when','$uri','$tags','$categories','$when')";
        if($result = mysql_query($sql ,$db)) {
        } else {
            echo "ERROR: ".mysql_error();
        }
        //echo "<h1>News Article added!</h1>"; //uncomment for testing purposes
    }    //end defining the function NewsUpdate

  16. I would still suggest using the regular expression I posted in posts 29 and 39 to convert all filenames to add your images folder to them. That's really all you need to do in terms of the regular expression. When you go through the list of attachments to save then you might not save all of the attachments and might end up with img elements that don't point to an actual file, but that's their fault for emailing a file that you don't support. It's another issue if you want to clean up the HTML afterwards to remove any img elements that don't link to actual files, but in general I would convert the filenames to something usable, save whatever attachments you think are worth saving, and that's it.
    Ok, I'm going to try something and post back here. Also,
    $content = preg_match('/Message: (.*)/', $html_part, $matches) ? $matches[1] : '';   

    This only captures what's on the first line. I have set certain rules for the PHP script as a whole, so everything after Message: will be the body content for the news article. I want to get everything (paragraphs, images) after Message:. The regex above only gets what's on that line, though. I modified it to this:

    $content = preg_match('/Message: ([^"@]*)/', $html_part, $matches) ? $matches[1] : ''; 

    But that only gets one paragraph. I need support for multiple paragraphs, line breaks and such. Could you show me a modified regex that will do this?

  17. That pattern will only match jpg images, if that's what you want it to do, but you would need to escape the period so that it looks for a literal period instead of any character. I would recommend allowing all filenames and all image extensions (that you want to support) instead of expecting anything to be in a certain format. It's not feasible to try and build a list of all possible email clients and what rules they follow. Some email clients may not rename the images at all, so you have to assume that the filenames could literally be any valid filename.
    Could you point me in the right direction (i.e. php.net manuals), or could you possibly draw up some example code of what you are suggesting so I can do this? I want to avoid writing a set of functions that look for an email client's specific image naming scheme.
  18. I found out you were also right about the filename extensions comment. I assumed that all email clients name their images like Outlook does: "image001.png". When I tested it with Windows Live Mail, it named its images "Image1[2].gif" and "Image2[2].gif". I'd possibly have to try Gmail and Yahoo as well. What do you recommend so that I can avoid building a series of if statements that look at the X-Mailer: header to know how to handle the images in that email?

  19. You wouldn't go through the list of attachments manually anyway, you would write a loop to loop through all of them in PHP and check each filename to decide if you want to save that file based on the extension. I'm just assuming a certain set of rules, you need to define the actual rules that you want your code to check for yourself. You can change the filename part of the pattern so that it explicitly looks for the dot and a specific extension if you want to add that requirement for the regex.
    How can I modify the regex to look for specific filenames? I really don't know that much about editing it.
    #src="cid:([^"@]*)@([^"]*)"#

    EDIT: Well, I edited it this way:

    #src="cid:([^"@]*).jpg@([^"]*)"#

    and it worked in a test preg_replace call I did. Just to be sure, is that the correct way? Should I have any reservations or take any precautions about just adding the filename extension smack in the middle like that?
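One precaution with the edited pattern above: an unescaped dot matches any character, so .jpg would also match something like "Xjpg". A sketch of a variant with the dot escaped and several extensions allowed is shown below; the extension list, the sample markup, and the images/ target folder are assumptions for illustration.

```php
<?php
$html = '<img src="cid:image001.jpg@01CCD11C.6AFFF2F0">'
      . '<img src="cid:photo.PNG@01CCD11C.6AFFF2F0">'
      . '<img src="cid:notes.txt@01CCD11C.6AFFF2F0">';

// \. is a literal dot; the alternation lists the extensions to accept;
// the trailing i makes the match case-insensitive.
$find = '#src="cid:([^"@]*\.(?:png|jpe?g|gif|bmp))@[^"]*"#i';
$html = preg_replace($find, 'src="images/$1"', $html);

echo $html, "\n";
```

Only the first two tags are rewritten; the .txt reference is left as it was because its extension is not in the list.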

  20. The regular expression doesn't do that, but when you're going through the list of attachments you can check each extension and only save the ones you want to.
    But now, knowing what I told you about me using cron jobs, wouldn't this still conflict with that? I mean, nothing will really be done manually, so I can't go through the list of attachments. You said the regex could be modified to look for specific filename extensions? I believe you posted an example of it on the previous page, though it was an exact copy of what you gave me before.
  21. I should've mentioned early on that this script will be used solely for cron jobs, to periodically update the news feed whenever an email is sent and to check at a given date and time. That is why I have a lot of things and goals to run automatically without much manual editing except for the initial setup. Does this change anything I should know about?
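Checking each extension, as the quoted reply at the top of this post suggests, does not need anyone to go through the list by hand; a loop can do it on every cron run. A minimal sketch, assuming the attachment filenames have already been collected into an array (the list of names and the allowed extensions here are made up):

```php
<?php
// Extensions the script is willing to save (an assumption for this sketch).
$allowed = ['gif', 'png', 'jpg', 'jpeg', 'bmp'];

// Filenames as they might come out of the attachment list.
$attachments = ['Image1[2].gif', 'report.pdf', 'photo.JPEG', 'image001.png'];

$toSave = [];
foreach ($attachments as $filename) {
    // Lower-case the extension so "JPEG" and "jpeg" are treated alike.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    if (in_array($ext, $allowed, true)) {
        $toSave[] = $filename;
    }
}

print_r($toSave);
```

Anything not in the allowed list, such as the PDF here, is simply skipped.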

  22. That's what this will do:
    $find = '#src="cid:([^"@]*)@([^"]*)"#';
    $replace = 'src="images/$1"';
    $html_part = preg_replace($find, $replace, $html_part);

    If the markup has this: <img src="cid:some_file.ext@01CCD11C.6AFFF2F0"> then it will be replaced to this: <img src="images/some_file.ext"> Presumably, there would be an entry in the attachments array with the filename "some_file.ext" that you would save to the images folder.

    Thanks for that, though how would the script be able to differentiate between the different image filetypes that could be inserted into the email? (.gif, .png, .jpg, .bmp) then saved to the images folder.
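On the file type question, a short sketch with made-up markup suggests the quoted pattern does not need to differentiate at all: whatever sits between "cid:" and "@" is captured as $1, extension included, so each type is carried over into the new src unchanged.

```php
<?php
$html_part = '<img src="cid:banner.gif@01CCD11C.6AFFF2F0">'
           . '<img src="cid:chart.png@01CCD11C.6AFFF2F0">'
           . '<img src="cid:photo.jpg@01CCD11C.6AFFF2F0">';

// Same find/replace as quoted above.
$find = '#src="cid:([^"@]*)@([^"]*)"#';
$replace = 'src="images/$1"';
$html_part = preg_replace($find, $replace, $html_part);

echo $html_part, "\n";
```

This only holds when the client puts the filename before the "@", as in the Outlook-style example; a CID that is an opaque ID would still need a lookup against the attachment headers.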