php


I was trying to figure out the best way to test the myfputcsv() function I posted yesterday. Reading the data from disc before comparing it seemed like a step where errors could creep in, but there was no obvious way round that, as the purpose of the function is to write to disc.

Then I realised I could use PHP’s (fairly) new IO streams to dump the function’s output to a temporary buffer, and read it back in for comparison. Not perfect, but it removes concerns about file mutexes, permissions, unique filenames, etc. and speeds up the tests, as they never touch disc.

It wasn’t worth building on top of PHPUnit for this… checking failure conditions can wait for another day. I just wrote a quick script that compares the output of fputcsv() and myfputcsv(). (And it worked - I’ve already fixed two errors in yesterday’s post).

This is the first time I’ve reached for php://memory. It’s obviously useful for testing code that writes to disc, but I’m now wondering where else it might come in handy.

<?php
 
require_once 'myfputcsv.php';
 
function write( $function, $array )
{
    $fp = fopen( 'php://memory', 'w+' );
    $function( $fp, $array );
    rewind( $fp );
    return fread( $fp, 1024 );
}
 
function test( $array )
{
    $fputcsv = write( 'fputcsv', $array );
    $myfputcsv = write( 'myfputcsv', $array );
    // Compare directly: PHP 8 no longer evaluates string arguments to assert()
    if ( $fputcsv === $myfputcsv )
    {
        echo "OK" . PHP_EOL;
    }
    else
    {
        echo "FAIL" . PHP_EOL;
    }
}
 
test( array() );
test( array( "Hello", "World" ) );
test( array( "He\nllo", "World" ) );
test( array( "He\\\"llo", "World" ) );
test( array( "He llo", "World" ) );
test( array( "He\tllo", "World" ) );
test( array( "He\"\"llo", "World" ) );
test( array( "He,llo", "World" ) );
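
One other place php://memory might come in handy is getting fputcsv()'s output as a string instead of a file. This is only a minimal sketch: the helper name csv_line() is made up for illustration and is not part of PHP or of the script above.

```php
<?php

/**
 * Hypothetical helper: returns one CSV-encoded line as a string
 */
function csv_line( array $fields )
{
    $fp = fopen( 'php://memory', 'w+' );
    // Pass delimiter, enclosure and escape explicitly so nothing relies on defaults
    fputcsv( $fp, $fields, ',', '"', '\\' );
    rewind( $fp );
    $line = stream_get_contents( $fp );
    fclose( $fp );
    return $line;
}

echo csv_line( array( 'He,llo', 'World' ) ); // "He,llo",World
```

Using stream_get_contents() rather than a fixed-size fread() means the helper isn't limited to short lines.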

I recently needed an fputcsv() with a couple of modifications (I needed fields quoted unconditionally, and more than one character in the delimiter field). I looked at a couple of versions from the man page comments, but they were buggy in one way or another, and PHP4-specific.

The function below is as close as I can get to fputcsv()’s behaviour. I’m throwing it out there in the hope that it will be useful to someone, someday. It should be easy enough to modify to suit specific requirements.

/**
 * myfputcsv()
 *
 * Mimics the observed behaviour of PHP's fputcsv()
 *
 * Requires at least PHP 5.2.0 due to reliance on __toString
 *
 * @param resource $fp valid file pointer
 * @param array $fields array of values
 * @param string $delimiter optional parameter sets the field delimiter character. Defaults to ','
 * @param string $enclosure optional parameter sets the field enclosure character. Defaults to '"'
 * @return int|false the length of the written string, or FALSE on failure
 */
function myfputcsv( $fp, $fields, $delimiter = ',', $enclosure = '"' )
{
    /**
     * Validate incoming values
     *
     * Weird corner cases are checked for here, so we can mimic fputcsv() as closely
     * as possible. Eg we check whether or not an object passed as $delimiter or
     * $enclosure implements __toString
     */
    if ( !is_resource( $fp ) )
    {
        trigger_error( __FUNCTION__ . '() expects parameter 1 to be resource, ' . gettype( $fp ) . ' given', E_USER_WARNING );
        return false;
    }
    if ( !is_array( $fields ) )
    {
        trigger_error( __FUNCTION__ . '() expects parameter 2 to be array, ' . gettype( $fields ) . ' given', E_USER_WARNING );
        return false;
    }
    if ( is_object( $delimiter ) && method_exists( $delimiter, '__toString' ) )
    {
        $delimiter = ( string ) $delimiter;
    }
    if ( is_object( $enclosure ) && method_exists( $enclosure, '__toString' ) )
    {
        $enclosure = ( string ) $enclosure;
    }
    if ( $delimiter == null )
    {
        trigger_error( __FUNCTION__ . '(): delimiter must be a character', E_USER_WARNING );
        return false;
    }
    if ( $enclosure == null )
    {
        trigger_error( __FUNCTION__ . '(): enclosure must be a character', E_USER_WARNING );
        return false;
    }
    if ( !is_scalar( $delimiter ) )
    {
        trigger_error( __FUNCTION__ . '() expects parameter 3 to be string, ' . gettype( $delimiter ) . ' given', E_USER_WARNING );
        return false;
    }
    if ( !is_scalar( $enclosure ) )
    {
        trigger_error( __FUNCTION__ . '() expects parameter 4 to be string, ' . gettype( $enclosure ) . ' given', E_USER_WARNING );
        return false;
    }
    if ( strlen( $delimiter ) > 1 )
    {
        trigger_error( __FUNCTION__ . '(): delimiter must be a single character', E_USER_NOTICE );
        $delimiter = $delimiter[0];
    }
    if ( strlen( $enclosure ) > 1 )
    {
        trigger_error( __FUNCTION__ . '(): enclosure must be a single character', E_USER_NOTICE );
        $enclosure = $enclosure[0];
    }
 
    /**
     * Prepare fields for writing to file by escaping them and wrapping them
     * in $enclosure
     */
    for( $i = 0; $i < sizeof( $fields ); $i++ )
    {
        /**
         * Make a decision on whether or not to use $enclosure
         */
        $use_enclosure = false;
        if ( strpos( $fields[$i], $delimiter ) !== false )
        {
            $use_enclosure = true;
        }
        if ( strpos( $fields[$i], $enclosure ) !== false )
        {
            $use_enclosure = true;
        }
        if ( strpos( $fields[$i], "\\" ) !== false )
        {
            $use_enclosure = true;
        }
        if ( strpos( $fields[$i], "\n" ) !== false )
        {
            $use_enclosure = true;
        }
        if ( strpos( $fields[$i], "\r" ) !== false )
        {
            $use_enclosure = true;
        }
        if ( strpos( $fields[$i], "\t" ) !== false )
        {
            $use_enclosure = true;
        }
        if ( strpos( $fields[$i], " " ) !== false )
        {
            $use_enclosure = true;
        }
 
        if ( $use_enclosure == true )
        {
            // Split on backslash-escaped enclosures so they are not doubled
            $fields[$i] = explode( "\\{$enclosure}", $fields[$i] );
            for( $j = 0; $j < sizeof( $fields[$i] ); $j++ )
            {
                $fields[$i][$j] = explode( $enclosure, $fields[$i][$j] );
                $fields[$i][$j] = implode( "{$enclosure}{$enclosure}", $fields[$i][$j] );
            }
            $fields[$i] = implode( "\\{$enclosure}", $fields[$i] );
            $fields[$i] = "{$enclosure}{$fields[$i]}{$enclosure}";
        }
    }
 
    /**
     * Write fields as a $delimiter-delimited string, and return number of
     * bytes written
     */
    return fwrite( $fp, implode( $delimiter, $fields ) . "\n" );
}
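
For the modifications mentioned at the top of this post (unconditional quoting and a multi-character delimiter), something like the following might be a starting point. Treat it as a sketch under assumptions: the name fputcsv_quoted() is invented, it skips the parameter validation shown above, and it makes no attempt to reproduce fputcsv()'s backslash handling.

```php
<?php

/**
 * Hypothetical variant: quotes every field and accepts a delimiter of any length
 */
function fputcsv_quoted( $fp, array $fields, $delimiter = ',', $enclosure = '"' )
{
    $quoted = array();
    foreach ( $fields as $field )
    {
        // Double any embedded enclosure, then wrap the field unconditionally
        $escaped = str_replace( $enclosure, $enclosure . $enclosure, (string) $field );
        $quoted[] = $enclosure . $escaped . $enclosure;
    }
    return fwrite( $fp, implode( $delimiter, $quoted ) . "\n" );
}

$fp = fopen( 'php://memory', 'w+' );
fputcsv_quoted( $fp, array( 'a', 'b "c"' ), '||' );
rewind( $fp );
echo stream_get_contents( $fp ); // "a"||"b ""c"""
```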

A base class is "a class from which other classes are derived".

Many OO languages have the concept of a single base class from which all other classes are explicitly or implicitly descended. For example, Ruby, Java and .NET all have Object.

It’s a very common belief that PHP implements stdClass as a base class for all objects, but this is in fact not the case:

<?php
 
class DoesNotExtend {}
 
class DoesExtend extends stdClass {}
 
$doesNotExtend = new DoesNotExtend();
$doesExtend = new DoesExtend();
 
var_dump($doesNotExtend instanceof stdClass);
var_dump($doesExtend instanceof stdClass);

Outputs:

bool(false)
bool(true)
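
A couple of other checks point the same way. This is a small sketch using a made-up class name, Standalone; get_parent_class() and class_parents() are standard PHP functions.

```php
<?php

class Standalone {}

$object = new Standalone();

var_dump( get_parent_class( $object ) ); // bool(false): there is no parent class
var_dump( class_parents( $object ) );    // an empty array
var_dump( is_object( $object ) );        // bool(true)
```

If the goal is to accept "any object", is_object() is probably the check to reach for, rather than instanceof stdClass.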

When a language is defined by its implementation rather than a standard, it can sometimes be tricky to decide what should be considered correct behaviour and what should be considered an implementation bug.

What follows isn’t so much a PHP trick as a fix for something that really should work, but doesn’t. Although the manual implies that the behaviour described below is specific to Zend Engine 1, all my tests were performed against Zend Engine 2.2, PHP 5.2.5.

Quoting from the manual:

The Zend Engine 1, driving PHP 4, implements the static and global modifier for variables in terms of references. For example, a true global variable imported inside a function scope with the global statement actually creates a reference to the global variable. This can lead to unexpected behaviour which the following example addresses:

<?php
function test_global_ref() {
    global $obj;
    $obj = &new stdclass;
}
 
function test_global_noref() {
    global $obj;
    $obj = new stdclass;
}
 
test_global_ref();
var_dump($obj);
test_global_noref();
var_dump($obj);
?>

Executing this example will result in the following output:

NULL
object(stdClass)(0) {}

- http://uk2.php.net/static

The example above uses instances of stdClass, but attempts to assign references to scalar values, arrays or resources to global variables have the same result: the attempt to modify the global fails without error (even with error_reporting(E_ALL)), and the global retains whatever value it had before the function call.
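
One way to picture what is happening: the global statement makes the local name an alias of the global variable, and a later reference assignment simply points that local name somewhere else. The sketch below shows this with a scalar. The function names are invented for illustration, and it reads the result through $GLOBALS so that it behaves the same wherever it is run.

```php
<?php

$GLOBALS['value'] = 'original';

function assign_by_reference()
{
    global $value;           // local $value is now an alias of the global
    $replacement = 'changed';
    $value = &$replacement;  // re-points the local name only; the global is untouched
}

function assign_by_value()
{
    global $value;
    $value = 'changed';      // writes through the alias to the global
}

assign_by_reference();
var_dump( $GLOBALS['value'] ); // string(8) "original"

assign_by_value();
var_dump( $GLOBALS['value'] ); // string(7) "changed"
```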

The workaround is very simple - assign the reference via the $GLOBALS superglobal:

<?php
 
function test_global_ref()
{
    $GLOBALS['obj'] = &new stdclass;
}
 
test_global_ref();
var_dump($obj);

Outputs:

object(stdClass)#1 (0) { }

I think one of these scripts must expose an implementation bug - either assigning a reference to a global variable should work, in which case the first script should not fail, or it should not work and the second script should fail. It would be interesting to get an opinion from someone involved in language internals on how PHP should behave. In either case, it’s very hard to understand why the first script doesn’t currently cause a notice or error to be thrown when the assignment fails.

PHP Singleton

I recently had a bad deer-in-the-headlights moment over a simple Singleton pattern. In an effort to turn a negative into a positive, and to burn something new (it never occurred to me that it was necessary to implement the __clone() magic method) into my feeble memory, here’s an implementation of Singleton in PHP, along with a couple of unit tests.

Of course, Singleton is still just another name for global variable.

 
/**
 * EmptySingleton
 *
 * Does nothing, but only one of it can exist at a time
 *
 * @version $Id$
 */
class EmptySingleton
{
    private static $_self = null;
 
    /**
     * EmptySingleton::init()
     *
     * Create an instance of EmptySingleton if necessary, and return it
     *
     * @access public
     * @return EmptySingleton
     */
    public static function init()
    {
        if ( self::$_self == null )
        {
            self::$_self = new EmptySingleton();
        }
 
        return self::$_self;
    }
 
    /**
     * EmptySingleton::__construct()
     *
     * The only place this can be called from is EmptySingleton::init()
     *
     * @access private
     */
    private function __construct()
    {
    }
 
    /**
     * EmptySingleton::__clone()
     *
     * If this somehow manages to get itself called, it throws an error
     *
     * @access private
     * @throws Exception
     */
    /** @codeCoverageIgnoreStart */
    private function __clone()
    {
        throw new Exception();
    }
    /** @codeCoverageIgnoreEnd */
}
 
class EmptySingletonTest extends PHPUnit_Framework_TestCase
{
    public function testCanInit()
    {
        $this->_obj = EmptySingleton::init();
        $this->assertTrue( $this->_obj instanceof EmptySingleton );
    }
 
    public function testOnlyOneInstance()
    {
        $instance_one = EmptySingleton::init();
        $instance_two = EmptySingleton::init();
 
        $instance_one->temp = 42;
 
        $this->assertEquals( $instance_one->temp, $instance_two->temp );
    }
}
 
$suite = new PHPUnit_Framework_TestSuite( 'EmptySingletonTest' );
PHPUnit_TextUI_TestRunner::run( $suite );

Update: Poking around online, I came across this interesting singleton implementation. The author uses class variables rather than instance variables, meaning that every instance of the class shares the same state. As thread safety isn’t really an issue in PHP it will work, but I think forcing class users to avoid the constructor offers a useful hint that something unusual’s going on with that class.
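
For comparison, the shared-state approach described in the update might look roughly like this. It is a sketch of the general idea with invented names (SharedState, set(), get()), not the linked author's code.

```php
<?php

/**
 * Hypothetical class: any number of instances, but one shared set of values
 */
class SharedState
{
    private static $data = array();

    public function set( $key, $value )
    {
        self::$data[$key] = $value;
    }

    public function get( $key )
    {
        return isset( self::$data[$key] ) ? self::$data[$key] : null;
    }
}

$one = new SharedState();
$two = new SharedState();

$one->set( 'temp', 42 );
var_dump( $two->get( 'temp' ) ); // int(42)
var_dump( $one === $two );       // bool(false): separate objects, same state
```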

From the PHP manual:

A valid variable name starts with a letter or underscore, followed by any number of letters, numbers, or underscores. As a regular expression, it would be expressed thus: '[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'

In other words, variable names can’t begin with a numeral. However there are a couple of ways to slip an illegal variable name into the symbol table:

$var = '1';
 
$$var = 'hello world';

or

${'1'} = 'hello world';

You can prove to your own satisfaction that the variables really exist with PHP’s rather obscure compact() function:

print_r( compact( '1' ));

Please note: if you use any of this in production code - and that includes compact() - I will come to your house and beat you with something heavy.

Array notation is fine, but it can look a bit clunky when you’re working with complex structures. This is a fairly simple example, but I’m sure we’ve all dealt with worse:

 
$clientChanges['deletes'][$val['fkClient']] = $val['Total'];

Casting the array to an object allows us to use object notation (->) and makes the code more readable:

 
$val = (object) $val;
$clientChanges = (object) $clientChanges;
 
$clientChanges->deletes[$val->fkClient] = $val->Total;

You can even get away with using array functions on objects, as long as they’re just simple collections of properties:

 
$o3 = (object) array_merge( (array) $o1, (array) $o2 );

Of course, member functions won’t make it through this kind of mangling, but bizarrely, private variables do:

 
class O
{
    private $a = 4;
    var $b = 5;
    var $c = 6;
}
 
$o = new O();
 
$o = ( object ) array_reverse( ( array )$o );
 
var_dump( $o );
 
/**
 * outputs:
 *
 * object(stdClass)#2 (3) {
 *     ["c"]=> int(6)
 *     ["b"]=> int(5)
 *     ["a:private"]=> int(4)
 * }
 */

There’s a post up at PHP::Impact about directory structures for web applications. The recommendations are great for a single project on a single domain, but once you get beyond that I think a different approach is called for. First the quick summary, then the expanded text:

  • One copy of a library per server, not per project
  • Use source control and a build script, rather than manually versioning with directories
  • Put your VHosts file in your config/ directory, and link to it with an Apache Include directive
  • Multiple webroots per project allows for shared code
  • Switch configuration settings against $_SERVER['SERVER_NAME']


If you treat libraries as “children” of your project you have to maintain a different copy of each library for each project on the machine. Versioning nightmare. A better alternative is to have a single, server-wide location for common libraries, and either place this location on the include_path (our approach), or symlink it to the project’s local library folder.

Here’s the standard layout we’ve thrashed out, which is based heavily on Zend’s default layout:

project/
    application/
        controllers/
        models/
        views/
    config/
        httpd-vhosts.conf
        test.project.com.ini
        www.project.com.ini
        admin.project.com.ini
    logs/
        admin.project.com.log
        test.project.com.log
        www.project.com.log
    admin.project.com/
        index.php
    test.project.com/
        index.php
    www.project.com/
        index.php

Use source control and a build script

The key difference is that multiple webroots exist within a single project. In our example we have www (the public-facing site), admin (the control panel) and test (for unit tests). Each webroot contains a front controller, a few static files and some mod_rewrite rules. These sub-domains are held within the same project because they share so much code (models etc).

The front controller picks up $_SERVER['SERVER_NAME'] and uses that to discover/create its own "{$_SERVER['SERVER_NAME']}.ini" and "{$_SERVER['SERVER_NAME']}.log" files, meaning the build script is trivial.
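
As a rough illustration of that lookup, a front controller could derive both paths along these lines. This is a sketch, not our actual code: the function name site_paths() and the /srv/project root are assumptions, and the hostname check is there because SERVER_NAME ends up in a filesystem path.

```php
<?php

/**
 * Hypothetical helper: maps a server name to its config and log files
 */
function site_paths( $serverName, $projectRoot )
{
    // Allow hostname characters only, so the name cannot contain a path separator
    if ( !preg_match( '/^[a-z0-9.-]+$/i', $serverName ) )
    {
        throw new InvalidArgumentException( 'Unexpected server name' );
    }

    return array(
        'config' => "{$projectRoot}/config/{$serverName}.ini",
        'log'    => "{$projectRoot}/logs/{$serverName}.log",
    );
}

$paths = site_paths( 'www.project.com', '/srv/project' );
echo $paths['config'] . PHP_EOL; // /srv/project/config/www.project.com.ini
echo $paths['log'] . PHP_EOL;    // /srv/project/logs/www.project.com.log
```

In a real front controller the first argument would be $_SERVER['SERVER_NAME'].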

The httpd-vhosts.conf file contains VHost configurations for each site, and is linked with a simple

Include /srv/project/config/httpd-vhosts.conf

in the httpd.conf. Generating the httpd-vhosts.conf file is pretty much the only thing our build script has to do, but that’s just a matter of setting paths:

<VirtualHost *:80>
    ServerAdmin admin@project.com
    DocumentRoot "/srv/project/www.project.com"
    ServerName www.project.com
    ErrorLog "logs/www.project.com-error_log"
    CustomLog "logs/www.project.com-access_log" common
    php_value include_path ".:/usr/share/zend:/srv/project"
</VirtualHost>

There are some additional links on this subject at the end of the PHP::Impact post linked above.

PHP is a weakly-typed language. By that I mean that variables are assigned values without regard to variable type, and implicit conversion at runtime sorts out any conflicts. PHP will happily let you compare a string and an integer, and if the string contains something unexpected, well, you should have paid more attention to data validation.

Given that the most important feature of PHP is the shallow learning curve, and that it is deployed in an environment where everything is a string, it’s (just) possible to argue that this was a good design decision. It allows you to do things like this:

 
$n = "5";
echo 3 + $n;

without getting into lots of tiresome explicit conversion. But the implicit rules are sometimes a little too clever for their own good, and throw up oddities like this:

 
$a = 'string';
$b = 0;
 
if ( $a == true && $b == false && $a == $b )
{
    echo ( 'universe broken' );
}

What’s going on here? Let’s look at these three clauses in detail:

  • ( 'string' == true ) because any non-empty string (other than "0") evaluates to true when compared with a boolean
  • ( 0 == false ) because the integer 0 undergoes implicit conversion to boolean and evaluates to false
  • Finally, ( 'string' == 0 ) because a string is silently converted to a number when compared with an integer. If the string begins with the ASCII representation of a number (eg "123"), it is assigned that value. If it doesn’t, it is assigned the value 0. So our third clause evaluates as true. Oops…
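
The three clauses can be checked one at a time with something like the sketch below. One caveat worth flagging: PHP 8.0 changed how numbers are compared with non-numeric strings, so the third result depends on the version you run it on.

```php
<?php

$a = 'string';
$b = 0;

var_dump( $a == true );   // bool(true)
var_dump( $b == false );  // bool(true)
var_dump( $a == $b );     // true before PHP 8.0, false from PHP 8.0 onwards

// The identical operator never converts types, so it is false on every version
var_dump( $a === $b );    // bool(false)

// "0" is the one non-empty string that compares equal to false
var_dump( '0' == false ); // bool(true)
```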

I routinely use the triple-equals operator (identical) instead of the double-equals operator (equality) now, but it feels and looks like a hack to avoid someone else’s bad design decision biting me. PHP is chock-full of these little oddities.

If you want all the gory details, the type comparison tables are tucked deep inside the PHP Manual.

Update: This post over at ycombinator contains an excellent one-line summary of what I was getting at.

Phillips and Pozidriv aren't synonyms

We all know that loose coupling is good and tight coupling is bad, so why, over the past couple of years, has the web industry gone nuts for tightly-coupled frameworks?

Since the publication of Pragmatic’s Ruby on Rails book, the industry has fallen in love with frameworks. Because Rails wrapped up several perfectly sound techniques in one easy-to-use package, people who learnt it were, almost incidentally, learning a solid web development pattern. Add a dash of silver bullet marketing, and it’s no surprise many of them rushed to re-implement it in their native languages. PHP in particular has spawned a ridiculous number of variations on the theme.

But the problem with RoR-derived frameworks is that, once you get past the trivial use-cases, the interdependency of the framework components and the “one-size-fits-all” ethos means you end up fighting your own tools for supremacy.

I and a couple of other developers started looking seriously at PHP frameworks last year. I was badly burnt by some legacy Smarty code four or five years ago, so although I saw the advantages of enforcing consistency and separation of concerns across a group of developers with wildly different styles, I was cheerleader for the most lightweight option - CodeIgniter.

Between us we examined CodeIgniter, Symfony and CakePHP. After prodding them for a while, we concluded that none of them were right for us - CodeIgniter was too primitive (for example it relied on an output buffering hack to embed views within views), Symfony was over-engineered, and CakePHP had poor documentation and slavishly followed some of RoR’s… dumber conventions.

We originally discounted Zend Framework because it was in a pretty primitive state when we started our research. However when we went back to take another look, we realised that even in its pre-1.5 state it had some potentially very useful components. Its key virtue for us was that, despite the name, it was very clearly a loosely-coupled library rather than a tightly-coupled framework (you can get a feel for this by viewing Federico Cargnelutti’s dependency graphs for some common PHP tools).

Zend’s orthogonal design allowed us to make use of individual components without dragging the entire framework along too. Attempting to decouple most frameworks requires major surgery - to take CodeIgniter as an example, its file-backed Config class is loaded automatically by the framework; if you prefer to store config settings in some other backing store, you’re out of luck. With Zend, however, it’s painless - not only can you back Zend_Config with any data store you want, you can abandon it entirely because none of the other components rely on it. This also gives Zend a much gentler learning curve, allowing us to get our (and other people’s) heads round it at our own pace, without committing ourselves to building an entire commercial project on top of an unknown quantity.

With Zend you can still take advantage of some very high-level components (eg routing, ACL), but if you want to, say, dump the Table Data Gateway-inspired DB classes in favour of PDO, nothing breaks. There’s little interdependency between the components. If you abandon large parts of the framework, you can even get away with replacing the whole MVC architecture with something a bit more sensible like PAC. (The new Zend_Layout component supports two-step rendering, which alleviates many of the problems with MVC, but it’s hardly an ideal solution).

In summary, if you’re searching for a framework for a non-trivial PHP project, or one that you can slowly incorporate into your existing style, please take a careful look at Zend. There are still some problems (eg the use of TDG instead of Active Record), but the loose coupling means you can avoid them most of the time.

Update: Brian recommends Kohana as a PHP5 version of CodeIgniter, with improved architecture.
