Archive for the ‘Développement’ Category

PHP: checking links validity using multi-thread processes

January 12th, 2012

I recently ran into a situation where I had to check the validity of 1800 links. There was no way to do it manually, so I decided to write some PHP code to check these URLs using the PHP get_headers function. Checking a URL seems easy, but each request takes a small amount of time because it calls remote resources: DNS lookup, connection time, download time… With a simple « while » loop, you may come across three issues:

  1. it may take very long to complete, and it’s not always possible to increase the PHP maximum execution time
  2. the display: your web page will stay blank for quite a long time until enough data is processed. Of course you may play with output buffering, but still!
  3. the process: as PHP is not a multi-threaded language out of the box, it will check the links one by one, which takes much longer…

Therefore, I developed a very simple piece of code to solve these issues. As it is working quite smoothly, I decided to share it with you.
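As a rough sketch of the idea (not the code from the full entry, which is not reproduced here), several links can be checked concurrently with the cURL multi interface instead of get_headers. This assumes the cURL extension is installed; the function names check_links and is_valid_status, the batch size and the timeout are all illustrative choices.

```php
<?php
/**
 * Checks a list of URLs concurrently, one batch at a time.
 * Hypothetical helper, requires the cURL extension.
 * @return array url => HTTP status code (0 when the request failed)
 */
function check_links(array $urls, $batch_size = 10, $timeout = 10) {
	$results = array();
	$urls = array_values(array_unique($urls));

	foreach (array_chunk($urls, $batch_size) as $batch) {
		$mh = curl_multi_init();
		$handles = array();

		foreach ($batch as $url) {
			$ch = curl_init($url);
			curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only, no body download
			curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
			curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
			curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
			curl_multi_add_handle($mh, $ch);
			$handles[$url] = $ch;
		}

		// Drive all transfers of this batch until none is still running
		do {
			$status = curl_multi_exec($mh, $running);
			if ($running) {
				curl_multi_select($mh, 1.0);
			}
		} while ($running && $status == CURLM_OK);

		foreach ($handles as $url => $ch) {
			$results[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
			curl_multi_remove_handle($mh, $ch);
			curl_close($ch);
		}
		curl_multi_close($mh);
	}
	return $results;
}

// A link is considered valid here when the final status is 2xx or 3xx
function is_valid_status($code) {
	return $code >= 200 && $code < 400;
}
```

Working in batches keeps the number of simultaneous connections bounded, and printing the results after each batch (with a flush) would also address the blank page issue.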
Read the rest of this entry »

Posted in Développement, Technologies | Comments (0)

PHP: How to extract attachments from email

January 4th, 2012

[EDIT 2013-01-14] The following code has been modified. There was a bug that prevented some very particular attachments from being saved. Thanks to Byron for alerting me. Also, attachments are now exported into a directory.

I’ve been busy working on a piece of code whose role is to extract all attachments from emails using PHP and IMAP. It is not really simple, so here’s a recursive function that does the work:

<?php

// SETTINGS
$server		= '{imap.vidax.net:993/imap/ssl}INBOX';
$username	= 'test@vidax.net';
$password	= 'test';
$export_dir	= '/home/axel/test/'; # final slash is required
// END SETTINGS

$mbox = imap_open($server, $username, $password) or die('Unable to login');

// Getting all emails
if ($headers = imap_headers($mbox)) {
	$i = 0;
	foreach ($headers as $val) {
		$i ++;

		// Returns a lot of information about the current email
		// Use var_dump($info) to check its content
		$info	= imap_headerinfo($mbox, $i);
		$msgid	= trim($info->Msgno);

		// Gets the current email structure (including parts)
		// Use var_dump($structure) to check it out
		$structure = imap_fetchstructure($mbox, $msgid);

		// Getting attachments
		// Will return an array with all included files
		// Also works with inline attachments
		$attachments = get_attachments($structure);

		// You are now able to get attachments' raw content
		foreach ($attachments as $k => $at) {
			$filename = $export_dir.'id_'.$msgid.'_part_'.str_replace('.', '-', $at['part']).'_'.$at['filename'];
			$content = imap_fetchbody($mbox, $msgid, $at['part']);

			if ($content !== false && $content !== '') {
				// $at['encoding'] is the transfer encoding reported by imap_fetchstructure()
				switch ($at['encoding']) {
					case 3: // base64
						$content = base64_decode($content);
					break;

					case 4: // quoted-printable
						$content = quoted_printable_decode($content);
					break;
				}

				file_put_contents($filename, $content);
			}
		}
	}
}
// Shutting down
imap_close($mbox);

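/**
* Gets all attachments,
* including inline images and the like
* @author: Axel de Vignon
* @param $content: the email structure (as returned by imap_fetchstructure)
* @param $part: not to be set, used for recursion
* @param $skip_parts: not to be set, used for recursion (positions of RFC822 parts to drop from the part number)
* @return array of array(type, encoding, part, filename)
*
*/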
function get_attachments($content, $part = null, $skip_parts = false) {
	static $results;

	// First round, emptying results
	if (is_null($part)) {
		$results = array();
	}
	else {
		// Removing first dot (.)
		if (substr($part, 0, 1) == '.') {
			$part = substr($part, 1);
		}
	}

	// Saving the current part
	$actualpart = $part;
	// Split on the "."
	$split = explode('.', $actualpart);

	// Skipping parts
	if (is_array($skip_parts)) {
		foreach ($skip_parts as $p) {
			// Removing a row off the array
			array_splice($split, $p, 1);
		}
		// Rebuilding part string
		$actualpart = implode('.', $split);
	}

	// Each time we get the RFC822 subtype, we skip
	// this part.
	if (strtolower($content->subtype) == 'rfc822') {
		// Never used before, initializing
		if (!is_array($skip_parts)) {
			$skip_parts = array();
		}
		// Adding this part into the skip list
		array_push($skip_parts, count($split));
	}

	// Checking ifdparameters
	if (isset($content->ifdparameters) && $content->ifdparameters == 1 && isset($content->dparameters) && is_array($content->dparameters)) {
		foreach ($content->dparameters as $object) {
			if (isset($object->attribute) && preg_match('~filename~i', $object->attribute)) {
				$results[] = array(
				'type'          => (isset($content->subtype)) ? $content->subtype : '',
				'encoding'      => $content->encoding,
				'part'          => empty($actualpart) ? 1 : $actualpart,
				'filename'      => $object->value
				);
			}
		}
	}

	// Checking ifparameters
	else if (isset($content->ifparameters) && $content->ifparameters == 1 && isset($content->parameters) && is_array($content->parameters)) {
		foreach ($content->parameters as $object) {
			if (isset($object->attribute) && preg_match('~name~i', $object->attribute)) {
				$results[] = array(
				'type'          => (isset($content->subtype)) ? $content->subtype : '',
				'encoding'      => $content->encoding,
				'part'          => empty($actualpart) ? 1 : $actualpart,
				'filename'      => $object->value
				);
			}
		}
	}

	// Recursion
	if (isset($content->parts) && count($content->parts) > 0) {
		// Other parts into content
		foreach ($content->parts as $key => $parts) {
			get_attachments($parts, ($part.'.'.($key + 1)), $skip_parts);
		}
	}
	return $results;
}
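
Because the filename comes from the email itself, it may contain path separators or unexpected characters. A minimal sketch of a sanitizing step that could be applied to $at['filename'] before building the export path; safe_attachment_name is a hypothetical helper, and the whitelist of allowed characters is an assumption to adapt to your needs.

```php
<?php
// Hypothetical helper: reduces an attachment name to a safe, flat file name
function safe_attachment_name($name) {
	// Normalise Windows separators, then keep only the last path component
	$name = basename(str_replace('\\', '/', $name));
	// Replace every character outside a conservative whitelist
	$name = preg_replace('~[^A-Za-z0-9._-]~', '_', $name);
	// Fall back to a default for empty or dot-only names
	return trim($name, '.') === '' ? 'attachment' : $name;
}
```

In the export loop, this would mean using safe_attachment_name($at['filename']) instead of $at['filename'] when composing $filename.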

Bonus tip: you may display images using their raw content data. For instance, the following HTML code will display a red dot:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA
AAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO
9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot">

The result: [image: red dot]
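
The same kind of data URI can be built in PHP from an attachment's decoded content. A small sketch, assuming you know the MIME type of the attachment; data_uri is a hypothetical helper name.

```php
<?php
// Hypothetical helper: builds a data URI from raw (already decoded) content
function data_uri($raw, $mime) {
	return 'data:'.$mime.';base64,'.base64_encode($raw);
}

// Usage, with $content being the decoded attachment from the export loop:
// echo '<img src="'.htmlspecialchars(data_uri($content, 'image/png')).'" alt="Attachment">';
```

This is only practical for small files, since the whole content ends up inside the HTML page.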

Posted in Développement, Technologies | Comments (0)

Subversion post-commit hook

March 8th, 2010

Following my previous post, I decided to write a Perl script that is executed on post-commit. Initially I had a shell script that automatically updated my development Apache document root every 3 minutes, so each commit I made became visible on my development site within 3 minutes.
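
To illustrate the principle only (the actual script is written in Perl and is not reproduced here), a post-commit hook can update the working copy as soon as a commit lands. Subversion calls hooks/post-commit with the repository path and the revision number as arguments. The sketch below is in PHP for consistency with the other examples on this page; the svn binary path, the document root and the function name are assumptions.

```php
<?php
// Sketch only: both paths below are assumptions
define('SVN_BIN', '/usr/bin/svn');   // hooks run with a nearly empty environment, so use an absolute path
define('DOCROOT', '/var/www/dev');   // hypothetical working copy served by Apache

// Builds the shell command that brings the working copy to the given revision
function build_update_command($svn, $docroot, $revision) {
	return sprintf(
		'%s update --non-interactive -r %d %s 2>&1',
		escapeshellarg($svn),
		(int) $revision,
		escapeshellarg($docroot)
	);
}

// Only run when called the way Subversion calls a hook: <repository path> <revision>
if (PHP_SAPI === 'cli' && isset($argv) && count($argv) >= 3) {
	exec(build_update_command(SVN_BIN, DOCROOT, $argv[2]), $output, $exit_code);
	if ($exit_code !== 0) {
		// Report the failure on stderr and exit non-zero so it does not go unnoticed
		fwrite(STDERR, implode("\n", $output)."\n");
		exit(1);
	}
}
```

Compared with polling every 3 minutes, the update happens right after each commit and only when there is something new to fetch.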

Read the rest of this entry »

Posted in Développement | Comments (2)