
Get Contents From E-Mail And Write It To Mysql Table


IanArcher


Right, that code would probably only look at one attachment. The first part in the message is the body, the attachments are listed as additional parts. So the first attachment would be the second part, and the second attachment would be the third part. If you're checking 2 parts then you're checking the body text and the first attachment.


That's true, I see what you're getting at. I'll work on counting the parts in the email returned from imap_fetchstructure; then I'll have a clearer view of replacing multiple cid: images with the actual files.

  • 2 weeks later...

Counting the parts in the email was pretty simple:

    $struct = imap_fetchstructure($mbox, $no);
    $num_parts = sizeof($struct->parts);

The issue is that some email clients give attachments with the same extension the same filename, e.g. image.jpeg. When I test with an email that has multiple attachments and save them to a directory, only one file gets saved. I've been researching a way to modify the MIME header if the value of a name or filename attribute is already present in the email, but that's difficult because the attachment part of the script is in an incrementing for loop that looks through each part one at a time. I simply want to rename the name/filename value with a _number if it already exists in the email's MIME headers. Any ideas?

    // PARAMETERS
    // get all parameters, like charset, filenames of attachments, etc.
    $params = array();
    if ($p->ifparameters)
        foreach ($p->parameters as $x)
            $params[ strtolower( $x->attribute ) ] = $x->value;
    if ($p->ifdparameters)
        foreach ($p->dparameters as $x)
            $params[ strtolower( $x->attribute ) ] = $x->value;

    // ATTACHMENT
    // Any part with a filename is an attachment,
    // so an attached text file (type 0) is not mistaken as the message.
    if (!empty($params['filename']) || !empty($params['name'])) {
        // filename may be given as 'Filename' or 'Name' or both
        $filename = (!empty($params['filename']))? $params['filename'] : $params['name'];
        // filename may be encoded, so see imap_mime_header_decode()
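One way to do the renaming inside the part-by-part loop, without touching the MIME headers at all, is to remember which names have already been handed out for the current message. This is only a sketch: `uniqueAttachmentName()` and `$seen` are made-up names that are not part of the script above, and it assumes it is enough to compare names case-insensitively within a single email.

```php
<?php
// Hypothetical helper: returns $filename unchanged the first time it is seen,
// and appends _1, _2, ... before the extension on later occurrences.
function uniqueAttachmentName(string $filename, array &$seen): string
{
    $pos  = strrpos($filename, '.');
    $base = ($pos === false) ? $filename : substr($filename, 0, $pos);
    $ext  = ($pos === false) ? '' : substr($filename, $pos);

    $candidate = $filename;
    $n = 0;
    // keep trying new suffixes until the candidate has not been used yet
    while (isset($seen[strtolower($candidate)])) {
        $n++;
        $candidate = $base . '_' . $n . $ext;
    }
    $seen[strtolower($candidate)] = true;
    return $candidate;
}

// Stand-in for the filenames found while looping over the parts of one email.
$seen  = array();
$names = array();
foreach (array('image.jpeg', 'image.jpeg', 'notes.txt', 'image.jpeg') as $raw) {
    $names[] = uniqueAttachmentName($raw, $seen);
}
print_r($names);
```

In the real loop the call would sit right after `$filename` is read from `$params`, so that every later step works with the already unique name.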


You can use the file_exists function to check if a file already exists and rename the new one if so. http://www.php.net/m...file-exists.php
I know the answer lies somewhere along the lines of using that function. What I mean is that during testing I had an email with 3 .jpeg images sent from Gmail, and in the MIME headers they are all named image.jpeg. When it came down to saving them only one got saved, so the renaming must take place before saving.

In other words, for each attachment, use file_exists on the filename first, and if the file already exists, change the name of the file you are currently trying to save, e.g. image.jpeg, then image2.jpeg, image3.jpeg, or something along those lines.
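A minimal sketch of that check, assuming the attachments are written to a single directory. `nextFreePath()` is a made-up helper name, and the temporary directory only exists so the example can run on its own.

```php
<?php
// Hypothetical helper: returns a path inside $dir that does not exist yet,
// producing image.jpeg, image2.jpeg, image3.jpeg, ...
function nextFreePath(string $dir, string $filename): string
{
    $filename = basename($filename); // drop any directory part from the header value
    $pos  = strrpos($filename, '.');
    $base = ($pos === false) ? $filename : substr($filename, 0, $pos);
    $ext  = ($pos === false) ? '' : substr($filename, $pos);

    $candidate = $dir . '/' . $filename;
    $n = 1;
    while (file_exists($candidate)) {
        $n++;
        $candidate = $dir . '/' . $base . $n . $ext;
    }
    return $candidate;
}

// Demo: save three attachments that all claim to be called image.jpeg.
$dir = sys_get_temp_dir() . '/attach_demo_' . uniqid();
mkdir($dir);
$saved = array();
foreach (array('first', 'second', 'third') as $data) {
    $path = nextFreePath($dir, 'image.jpeg');
    file_put_contents($path, $data);
    $saved[] = basename($path);
}
print_r($saved);
```

Because the check runs against the directory rather than the email, it also avoids clobbering files left over from an earlier message.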


Images are saved using:

    // process attachments
    foreach ($attachments as $imageFilename => $data)
    {
        // places all attachments in the directory defined by $aSubject
        file_put_contents("$aSubject/" . $imageFilename, $data);
    }

Will that be an issue, or should I just follow what you suggested?


If you don't want them to overwrite each other, then yes, you need to do something to prevent that. You have to check the name and, if it already exists, change it before saving the file.


err, I think the script I'm using may be what you need. It works fine apart from attachments, which is why I posted in here: http://w3schools.inv...showtopic=42926
The majority (most of the end) of this thread has been about saving attachments (images at least), so I don't think it would be very useful to switch to something that doesn't really support attachments. As it stands, the OP has made significant progress in writing his own code for this, which appears to work well with multiple mail clients, and he is just in the process of ironing out a couple of bugs. I think he would prefer to stay the course, as it were, since he is nearly done with all the work he started out on.

Another realization I had: the script I'm using replaces the src attribute of an image tag with the filename recorded for that CID in the MIME headers. Once a file is renamed when it is saved, that replacement becomes useless, because the image tag's src won't be pointing to an actual file anymore.


And that's why I suggested you change the name before you do anything else, including saving the file, or replacing the text. The code you showed has a loop where the only thing it does is save all attachments. I think I suggested that you use one loop for the entire thing, the loop to do all of the "processing" of attachments, where "processing" means saving the file and replacing the HTML. If you have one loop where you save the attachments, and another loop where you change the HTML, then you're right, you won't know which file names changed. Those two operations should happen in the same loop. The loop should get the header information, validate the filename and change it if necessary, then save the file and update the HTML.


I wrote something to deal with filenames in the MIME headers that use identical names like that, which can cause overwriting. The issue now is where I need to save the attachment under the modified filename. This is somewhere earlier in the script, where attachments and their filenames are processed:

    // DECODE DATA
    $data = ($partno)?
        imap_qprint(imap_fetchbody($mbox,$mid,$partno)):  // multipart
        imap_body($mbox,$mid);  // not multipart
    // Any part may be encoded, even plain text messages, so check everything.
    if ($p->encoding==4)
        $data = quoted_printable_decode($data);
    elseif ($p->encoding==3)
        $data = base64_decode($data);
    // no need to decode 7-bit, 8-bit, or binary

    // PARAMETERS
    // get all parameters, like charset, filenames of attachments, etc.
    $params = array();
    if ($p->ifparameters)
        foreach ($p->parameters as $x)
            $params[ strtolower( $x->attribute ) ] = $x->value;
    if ($p->ifdparameters)
        foreach ($p->dparameters as $x)
            $params[ strtolower( $x->attribute ) ] = $x->value;

    // ATTACHMENT
    // Any part with a filename is an attachment,
    // so an attached text file (type 0) is not mistaken as the message.
    if (!empty($params['filename']) || !empty($params['name'])) {
        // filename may be given as 'Filename' or 'Name' or both
        $filename = (!empty($params['filename']))? $params['filename'] : $params['name'];
        // filename may be encoded, so see imap_mime_header_decode()
        $attachments[$filename] = $data;  // this is a problem if two files have same name

This is where I change the filename if it is just image.jpg|png|bmp or whatever popular extension:

    if ($fname != '' && $cid != '')
        $images[$cid] = $fname;  // map the $cid to the $image to produce the image filename
    $ifname = $fname;
    //while(){
    // the dot before the extension is escaped so it only matches a literal "."
    if ($f = preg_match('/(filename|name)="(image\.(png|jpg|jpeg|gif|bmp))"/i', $mimeheaders, $plainfile)) {
        echo '<br>This is a dangerous filename match that could result in file overwriting and must be numbered sequentially<br />';
        if ($ifname === $plainfile[2]) {
            echo 'This is the right filename<br>';
            echo $plainfile[2] . "<br>";
            if ($pos = strrpos($ifname, '.')) {
                $n = substr($ifname, 0, $pos);
                $ext = substr($ifname, $pos);
                // build the new name first, so the "<br />" is not stored as part of it
                $ifname = $n . "_" . $i . $ext;
                echo $ifname . "<br />";
                echo "<a style='font-size:18px; font-weight: bold'>$ifname</a><br />";
            }
        }
    }

But this is where it gets tricky: the code below is in an include file that saves the attachments/images and creates a directory to hold them, but it is only included when certain validation criteria are met.

    // searches the plain text portion of the email for the Title: and its corresponding text.
    $aSubject = preg_match('/Title:\s+(.*)/i', $text_part, $matches) ? $matches[1] : '';
    // trim Title: string from the subject, and keeps the group matched corresponding text
    $aSubject = str_replace("Title: ", "" ,$aSubject);
    // removes special characters from article/directory name and makes all characters lowercase
    $stripped = trim(strtolower(preg_replace("/[^A-Za-z0-9 ]/", "", $aSubject)));
    $aSubject = str_replace(" ", "_", $stripped);
    // the directory will be created here in the /newsfrommail/ application folder
    $dirname = "../newsfrommail/$aSubject/";
    // creates the directory
    $dir = mkdir($dirname, 0777);
    // process attachments
    foreach ($attachments as $filename => $data)
    {
        // places all attachments in the directory defined by $aSubject
        file_put_contents("$aSubject/".$filename, $data);
    }

I must change the filename somehow to that of $ifname so the attachment data gets saved to the modified filename and it isn't as simple as changing $filename to $ifname. Any ideas?


All of the attachments need to be saved with their original data like the CID so that you can figure out which is which. So instead of this line:

    $attachments[$filename] = $data;

it should make the $attachments array be a numbered array of arrays, one for each file. So that could be like this:

    $attachments[] = array('filename' => $filename, 'data' => $data, 'params' => $params);

The params item would be all of the parameters that it got from the headers for that part, so you can look up the CID in the params data for the attachment. The downside is that you'll need to loop through the attachments array to check each filename and list of params instead of being able to look them up by name in the array.

In the other loop you showed you're setting the value in the $images array too soon. You're saving the original filename in that array, then changing the filename. The changed filename should be saved in the array; that array is what you're supposed to use to figure out which filename to replace each CID with. It needs to contain the filename that was actually saved on the server.
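To make the "numbered array of arrays" idea concrete, here is a small sketch with mock data. The `cid` key and the `findAttachmentByCid()` helper are assumptions made for illustration; where the Content-ID actually ends up depends on how the parsing code reads each part, so adjust the key to whatever your own print_r output shows.

```php
<?php
// Mock attachments, as the parsing loop might build them. Two files share a
// name, which is fine because the array is no longer indexed by filename.
$attachments = array();
$attachments[] = array(
    'filename' => 'image.jpeg',
    'data'     => 'AAA',
    'params'   => array('name' => 'image.jpeg'),
    'cid'      => '<img1@example>',   // assumed key, see note above
);
$attachments[] = array(
    'filename' => 'image.jpeg',
    'data'     => 'BBB',
    'params'   => array('name' => 'image.jpeg'),
    'cid'      => '<img2@example>',
);

// Hypothetical lookup: loop through the list until the CID matches.
// Angle brackets are stripped so "<id>" and "id" compare equal.
function findAttachmentByCid(array $attachments, string $cid): ?array
{
    foreach ($attachments as $attachment) {
        if (isset($attachment['cid'])
            && trim($attachment['cid'], '<>') === trim($cid, '<>')) {
            return $attachment;
        }
    }
    return null;
}

$found = findAttachmentByCid($attachments, 'img2@example');
echo $found['data'], "\n";
```

The linear search is the downside mentioned above, but for the handful of attachments in a typical email it should not matter.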


How could I start to write something like that? Are there any specific functions? This sounds similar to the example code you gave for the

$fname = ''; $cid = '';


If the attachments array is a multidimensional array like that, then when you are processing the headers to build the array of image names and CIDs then for each CID you find you need to loop through the array and check the params array for the CID that you're looking for. The loop that saves the attachments would need to check if each filename already existed and rename the current one if so. If the attachments array now contains the CID in the params array then you don't need to loop through the headers to match up the CID with the filenames, both of those are in the attachments array. So you can run the code to process the email into parts, then the loop to save each attachment, then the loop to update the HTML with the filenames. The last loop could be combined with the one before it so that it saves the attachments, updates the filename, and updates the HTML.


Wow, to be honest, I really have no idea where to start writing something for that. Again, I'm not asking to be spoonfed, but could I see some example code, please?


Well, you start at the beginning. First you need to change how the attachments array gets built, so you do the change I showed in post 115. Then you change the loop that goes through the array, so instead of this:

    foreach ($attachments as $filename => $data)

you do something like this:

    foreach ($attachments as $attachment)

and then $attachment is the array of attachment information; it will have elements called filename, data, and params. You can print it out to see what it contains if you want to:

    foreach ($attachments as $attachment)
    {
        print_r($attachment);
    }

$attachment['filename'] and $attachment['data'] will be set to what $filename and $data were set to in the old code, but $attachment['params'] will contain additional information about the file (including the CID). If multiple files have the same name that won't be a problem because the array is no longer indexed by filename.

So, the next step is to get the filename and CID of the attachment from the attachments array, and then do the validation on the filename. It's hard to say how to get the CID without seeing example output from the code above that prints the array. Once you have the filename and CID then you validate the filename like you did before, although it's probably better to use file_exists to check for any file that already exists instead of using a regular expression to only (and always) rename files that start with "image". The algorithm to write a loop that will keep changing the filename until it gets to one that doesn't exist we can save for later.

Once you have the unique filename then you save the file using that filename, and since you have both the filename and the CID then you can use the regular expression to update the HTML code. All of this happens in the same loop: the foreach loop that goes through the attachments, which I showed above with just the print statement, is where all this other code gets added. It will actually simplify your code quite a bit. The goal of this code is to get the filename and CID, save the file, and update the HTML code, and all of that can be done in relatively few lines inside one loop. Much of this code is already written, it's just scattered around.
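Put together, the single loop could look roughly like the sketch below. Everything here is mock data: `$html`, the `cid` key in each attachment, and the temporary directory are stand-ins for what the mail-parsing code would provide. It also uses a plain str_replace on the exact `src="cid:..."` text instead of a regular expression, which is a simplification rather than the approach described above.

```php
<?php
// Mock input standing in for the parsed email.
$html = '<p>Hi</p><img src="cid:a1"><img src="cid:b2">';
$attachments = array(
    array('filename' => 'image.jpeg', 'data' => 'first',  'cid' => 'a1'),
    array('filename' => 'image.jpeg', 'data' => 'second', 'cid' => 'b2'),
);
$dirname = sys_get_temp_dir() . '/newsfrommail_demo_' . uniqid();
mkdir($dirname, 0777, true);

foreach ($attachments as $attachment) {
    // 1. get the filename and split off the extension
    $name = basename($attachment['filename']);
    $pos  = strrpos($name, '.');
    $base = ($pos === false) ? $name : substr($name, 0, $pos);
    $ext  = ($pos === false) ? '' : substr($name, $pos);

    // 2. keep changing the name until it does not exist on disk
    $n = 1;
    while (file_exists("$dirname/$name")) {
        $n++;
        $name = $base . '_' . $n . $ext;
    }

    // 3. save the file under the name that was actually chosen
    file_put_contents("$dirname/$name", $attachment['data']);

    // 4. point the matching image tag at the saved file
    if ($attachment['cid'] !== '') {
        $html = str_replace(
            'src="cid:' . $attachment['cid'] . '"',
            'src="' . $dirname . '/' . $name . '"',
            $html
        );
    }
}
echo $html, "\n";
```

Because the rename, the save, and the HTML update all use the same `$name` inside one iteration, the image tag can never end up pointing at the pre-rename filename.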


  • 2 weeks later...

I managed to get around this and successfully deal with the image looping and image naming process by using a recently published class found here: http://www.electrict...ge-attachments/ I use that class as an include in the script that creates variables such as $text_part and $html_part, and wrote the following:

    /**** The below sequence manages IMAGE Tags in E-mail that are supplied by direct URL links ****/
    if (preg_match_all('/<img\s+([^>]*)src="http(s)?:\/\/(.*)"/i', $emailMessage->bodyHTML, $matches)) {
        echo '<br />Detected E-mail Client uses URL links for image tags...continuing News from Mail session<br />';
    }

    /**** The below sequence manages IMAGE Tags in E-mail that are supplied by attachments ****/
    if (preg_match_all('/src="cid:(.*)"/Uims', $emailMessage->bodyHTML, $matches)) {
        if (count($matches)) {
            $search = array();
            $replace = array();
            // number the saved images sequentially: image1, image2, ...
            // (a plain counter replaces the outer for loop, which could re-save
            // every image when there were more attachments than cid: matches)
            $i = 1;
            foreach ($matches[1] as $match) {
                $matchExt = $emailMessage->attachments[$match]['subtype'];
                $fnameExt = (string)$matchExt;
                $fnameExt = strtolower($fnameExt);
                $uniqueFilename = "image$i.$fnameExt";
                $i++;
                file_put_contents("$dirname/$uniqueFilename", $emailMessage->attachments[$match]['data']);
                $search[] = "src=\"cid:$match\"";
                $replace[] = "src=\"$dirname/$uniqueFilename\"";
            }
            $emailMessage->bodyHTML = str_ireplace($search, $replace, $emailMessage->bodyHTML);
        }
    }

    /**** The below sequence manages IMAGE Tags in E-mail that are supplied by inline image data ****/
    if (preg_match_all('/src="data:(image\/(.*);.*,(.*))"/Uims', $emailMessage->bodyHTML, $basematches)) {
        if (count($basematches[3])) {
            $search = array();
            $replace = array();
            $i = 1;
            foreach ($basematches[3] as $base64data) {
                $uniqueFilename = "$dirname/image".$i.".png";
                $i++;
                $base64data_decoded = base64_decode($base64data);
                $imCreate = imagecreatefromstring($base64data_decoded);
                if ($imCreate !== false) {
                    imagepng($imCreate, $uniqueFilename);
                    imagedestroy($imCreate);
                }
                $search[] = "$base64data\"";
                $replace[] = "$uniqueFilename\"";
            }
            $emailMessage->bodyHTML = str_ireplace($search, $replace, $emailMessage->bodyHTML);
            // strip the leftover "data:image/...;base64," prefix in front of each saved filename
            $emailMessage->bodyHTML = preg_replace("/data:image\/(jpeg|png|bmp|gif);base64,/i", "", $emailMessage->bodyHTML);
        }
    }

I am kind of worried about using all that regex to parse this HTML as opposed to finding a DOMDocument solution, but in my research I haven't seen DOMDocument used for anything that goes as far as what the PCRE functions do here. With regex, the best one can do is cover as many variations of the markup as possible with pattern modifiers. I'm now trying to get around another bug where the Apple iPhone's standard Mail app splits messages into different parts because of inline attachments; I proved this with some testing. An example:
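For what it's worth, the cid: replacement at least can be done with DOMDocument. The sketch below is an illustration, not a drop-in replacement for the code above: it assumes the dom extension is enabled, and `replaceCidSources()` and the `$cidToFile` map (CID to saved file path) are made-up names.

```php
<?php
// Hypothetical helper: rewrites <img src="cid:..."> attributes using a
// map of CID => saved file path. Other image tags are left untouched.
function replaceCidSources(string $html, array $cidToFile): string
{
    // Email HTML is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if (stripos($src, 'cid:') === 0) {
            $cid = substr($src, 4);
            if (isset($cidToFile[$cid])) {
                $img->setAttribute('src', $cidToFile[$cid]);
            }
        }
    }
    return (string) $doc->saveHTML();
}

$html = '<p>Hello</p><img alt="x" src="cid:a1"><img src="http://example.com/b.png">';
$out  = replaceCidSources($html, array('a1' => 'images/image1.jpeg'));
echo $out;
```

Two caveats: saveHTML() returns a complete document, so the fragment comes back wrapped in the doctype, html and body elements that loadHTML() adds, and text outside plain ASCII may need explicit encoding handling before loading.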

    --Apple-Mail-1-577088534
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain; charset="us-ascii"

    Title: iPhone Email=0A=Snippet: iPhone Test=0A=Tags: iphone=2Cemail=2Cpictures=0A=Category: Member=0A=Message:=0A=Brothers and sisters have I none but this man's father is my father's son.==0A==0A=

    --Apple-Mail-1-577088534
    Content-Disposition: inline; filename="image.png"
    Content-Type: image/png; name="image.png"
    Content-Transfer-Encoding: base64

    iVBORw0KGgoAAAANSUhEUgAAAUAAAAHgCAYAAADUjLREAAAgAElEQVR4AeydB3xUVfbHTzpJCITemwgIWEBRFARBV117WxVde11X176WXeuuZdUt6t+1u4oFe8MK9opiQRAUadJ7S0hIz/zP94YbXh5v

    --Apple-Mail-1-577088534
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain; charset="us-ascii"

    =0A==0A=Brothers and sisters have I none but this man's father is my father's son.==0A=

    --Apple-Mail-1-577088534
    Content-Disposition: inline; filename="image.jpeg"
    Content-Type: image/jpeg; name="image.jpeg"
    Content-Transfer-Encoding: base64

    /9j/4AAQSkZJRgABAQAAAQABAAD/4gxYSUNDX1BST0ZJTEUAAQEAAAxITGlubwIQAABtbnRyUkdCIFhZWiAHzgACAAkABgAxAABhY3NwTVNGVAAAAABJRUMgc1JHQgAAAAAAAAAAAAAAAQAA9tYAAQAA

    --Apple-Mail-1-577088534
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain; charset="us-ascii"

    Brothers and sisters have I none but this man's father is my father's son.==0A=

    --Apple-Mail-1-577088534
    Content-Disposition: inline; filename="image.jpeg"
    Content-Type: image/jpeg; name="image.jpeg"
    Content-Transfer-Encoding: base64

    /9j/4AAQSkZJRgABAQEAYABgAAD/4gxYSUNDX1BST0ZJTEUAAQEAAAxITGlubwIQAABtbnRyUkdCIFhZWiAHzgACAAkABgAxAABhY3NwTVNGVAAAAABJRUMgc1JHQgAAAAAAAAAAAAAAAQAA9tYAAQAA

    --Apple-Mail-1-577088534
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain; charset="us-ascii"

    Brothers and sisters have I none but this man's father is my father's son.==0A=

    --Apple-Mail-1-577088534
    Content-Disposition: inline; filename="image.jpeg"
    Content-Type: image/jpeg; name="image.jpeg"
    Content-Transfer-Encoding: base64

    /9j/4AAQSkZJRgABAQEASABIAAD/4QI2RXhpZgAATU0AKgAAAAgACgEPAAIAAAAGAAAAhgEQAAIAAAALAAAAjAESAAMAAAABAAYAAAEaAAUAAAABAAAAmAEbAAUAAAABAAAAoAEoAAMAAAABAAIAAAEx

    --Apple-Mail-1-577088534
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain; charset="us-ascii"

    Brothers and sisters have I none but this man's father is my father's son.==0A==0A=Brothers and sisters have I none Inline image:=

    --Apple-Mail-1-577088534
    Content-Disposition: inline; filename="image.jpeg"
    Content-Type: image/jpeg; name="image.jpeg"
    Content-Transfer-Encoding: base64

    /9j/4AAQSkZJRgABAQEASABIAAD/4QI2RXhpZgAATU0AKgAAAAgACgEPAAIAAAAGAAAAhgEQAAIAAAALAAAAjAESAAMAAAABAAYAAAEaAAUAAAABAAAAmAEbAAUAAAABAAAAoAEoAAMAAAABAAIAAAEx

    --Apple-Mail-1-577088534
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain; charset="us-ascii"

    but this man's father is my father's son.=

    --Apple-Mail-1-577088534--

The email contained multiple lines of the same words, and I just inserted pictures among them. I also concluded that the client doesn't use <img> tags (except when forwarding a message, in which case it does use <img> tags).


The email format that Apple uses, with inline image sections.
Agreed. Apple iPhone Mail also doesn't use text/html content; it uses text/plain. No matter where you add a photo in the body, no image tag will show up, just the photo's Content headers. The most logical thing to do would be to just extract the plain text from the body. I considered writing a script to create img tags where those photo insertions should be, but that would require appending the parts together somehow, setting a default image size, say 400x350 (which presents even more complications), and changing the header to text/html. That seems nearly impossible to do and definitely not logical.
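If someone does want to try the img-tag idea, appending the parts together is the easier half of it. The sketch below assumes the message has already been split into a decoded, ordered list, and that the images have been saved to disk; `$parts`, its `type`/`data`/`file` keys, and `partsToHtml()` are all invented for illustration. Image sizing is left out and could be handled in CSS.

```php
<?php
// Hypothetical, already-decoded parts in the order they appear in the message.
$parts = array(
    array('type' => 'text',  'data' => "Title: iPhone Email\nMessage:\nBrothers and sisters have I none"),
    array('type' => 'image', 'file' => 'images/image.png'),
    array('type' => 'text',  'data' => "but this man's father is my father's son."),
    array('type' => 'image', 'file' => 'images/image_2.jpeg'),
);

// Walk the parts in order: text becomes an escaped paragraph,
// an image becomes an <img> tag pointing at the saved file.
function partsToHtml(array $parts): string
{
    $html = '';
    foreach ($parts as $part) {
        if ($part['type'] === 'text') {
            $text  = htmlspecialchars($part['data'], ENT_QUOTES, 'UTF-8');
            $html .= '<p>' . nl2br($text) . "</p>\n";
        } else {
            $src   = htmlspecialchars($part['file'], ENT_QUOTES, 'UTF-8');
            $html .= '<img src="' . $src . '" alt="">' . "\n";
        }
    }
    return $html;
}

$out = partsToHtml($parts);
echo $out;
```

Since the order of the parts is the order in which the sender typed text and inserted photos, walking them in sequence keeps each picture where it was placed in the message.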