Blog Posts

mbstring comes to the rescue

I've been working with SimpleXML a fair amount lately, and have run into an issue a number of times with character encodings. Basically, if a string has a mixture of UTF-8 and non-UTF-8 characters, SimpleXML barfs, claiming the "String could not be parsed as XML."

I tried a number of solutions, hoping actually to automate it via mbstring INI settings; these schemes all failed. iconv didn't work properly. The only thing that did work was to convert the encoding to latin1 — but this wreaked havoc with actual UTF-8 characters.

Then, through a series of trial-and-error, all-or-nothing shots, I stumbled on a simple solution. Basically, I needed to take two steps:

  • Detect the current encoding of the string
  • Convert that encoding to UTF-8

which is accomplished with:

$enc = mb_detect_encoding($xml);
$xml = mb_convert_encoding($xml, 'UTF-8', $enc);

The conversion is performed even if the detected encoding is UTF-8; the conversion ensures that all characters in the string are properly encoded when done.

It's a non-intuitive solution, but it works! QED.

Continue reading...

PHP Library Channel

I've been working on Cgiapp in the past few months, in particular to introduce one possibility for a Front Controller class. To test out ideas, I've decided to port areas of my personal site to Cgiapp2 using the Front Controller. Being the programmer I am, I quickly ran into some areas where I needed some reusable code — principally for authentication and input handling.

I've been exposed to a ton of good code via PEAR, Solar, eZ components, and Zend Framework. However, I have several criteria I need met:

  • I want PHP5 code. I'm coding in PHP5, I should be able to use PHP5 libraries, not PHP4 libraries that work in PHP5 but don't take advantage of any of its features.
  • I prefer few dependencies, particularly lock-in with existing frameworks. If I want to swap out a storage container from one library and use one from another, I should be free to do so without having to write wrappers so they'll fit with the framework I've chosen. Flexibility is key.
  • Stable API. I don't want to have to change my code every few weeks or months until the code is stable.
  • I should be able to understand the internals quickly.

So what did I choose? To reinvent the wheel, of course!

To that end, I've opened a new PEAR channel that I'm calling PHLY, the PHp LibrarY, named after my blog. The name implies soaring, freedom, and perhaps a little silliness.

It is designed with the following intentions:

  • Loosely coupled; dependencies should be few, and no base class should be necessary.
  • Extendible; all classes should be easily extendible. This may be via observers, interfaces, adapters, etc. The base class should solve 80% of usage, and allow extensions to the class to fill in the remainder.
  • Designed for PHP5 and up; all classes should make use of PHP5's features.
  • Documented; all classes should minimally have excellent API-level documentation, with use cases in the class docblock.
  • Tested; all classes should have unit tests accompanying them.
  • Open source and commercial friendly; all classes should use a commercial-friendly open source license. The BSD license is one such example.

Please feel free to use this code however you will. Comments, feedback, and submissions are always welcome.

Continue reading...

Cgiapp 1.9.0 released

I released Cgiapp 1.9.0 into the wild last night. The main difference between 1.8.0 and 1.9.0 is that I completely removed the plugin system. I hadn't had any users reporting that they were using it, and, in point of fact, the overloading mechanism I was using was causing some obscure issues, particularly in the behaviour of cgiapp_postrun().

As usual, you can find more information and links to downloads at the Cgiapp site.

Continue reading...

What is Cgiapp?

After some conversations with Paul and Mike, in recent months I realized that while I often announce new releases of Cgiapp, I rarely explain what it is or why I develop it.

~~I got into trouble on the PEAR list when I tried to propose it for inclusion in that project, when I made the mistake of describing it as a framework. (This was before frameworks became all the rage on the PHP scene; PEAR developers, evidently, will not review anything that could possibly be construed or interpreted as a framework, even if it isn't.)~~ I mistakenly called Cgiapp a framework once when considering proposing it to PEAR. But if it's not a framework, what is Cgiapp? Stated simply:

Cgiapp is the Controller of a Model-View-Controller (MVC) pattern. It can be either a front controller or an application controller, though it's typically used as the latter.

Continue reading...

Telcos are Attacking the Internet

I generally try to stay out of politics on this blog, but this time something has to be said, as it affects anyone who uses the internet, at least in the US.

Basically, a number of telcos and cable providers are talking about charging internet content providers — the places you browse to on the internet, places like Google, Yahoo!, Amazon, etc. — fees to ensure bandwidth to their sites. Their argument is that these content providers are getting a 'free ride' on their lines, and generating a lot of traffic themselves, and should thus be paying for the cost of bandwidth.

This is patently ridiculous. Content providers already have to pay for their bandwidth — they, too, have ISPs or agreements with telcos in place, either explicitly or via their hosting providers. Sure, some of them, particularly search engines, send out robots in order to index or find content, but, again, they're paying for the bandwidth those robots generate. Additionally, people using the internet are typically paying for bandwidth as well, through their relationship with their ISP. What this amounts to is the telcos getting paid not just by each person to whom they provide internet access, but every end point on the internet, at least those within the US.

What this is really about is telcos wanting more money, and wanting to push their own content. As an example, let's say your ISP is AOL. AOL is part of Time Warner, and thus has ties to those media sources. Now, those media sources may put pressure on AOL to reduce bandwidth to sites operated by ABC, CBS, NBC, FOX, Disney, PBS, etc. This might mean that your kid can no longer visit the Sesame Street website reliably, because AOL has reduced the amount of bandwidth allowed to that service — but any media site in the TWC would get optimal access, so they could get to Cartoon Network. Not to slam Cartoon Network (I love it), but would you rather have your kid visiting or Basically, content providers would not need to compete based on the value of their content, but on who they can get to subscribe to their service.

Here's another idea: your ISP is MSN. You want to use Google… but MSN has limited the bandwidth to Google because it's a competitor, and won't accept any amount of money to increase that bandwidth. They do the same with Yahoo! So, now you're limited to MSN search, because that's the only one that responds reliably — regardless of whether or not you like their search results. By doing so, they've just artificially inflated the value of their search engine — without needing to compete based on merit.

Additionally, let's say Barnes and Noble has paid MSN to ensure good bandwidth, but part of that agreement is a non-compete clause. Now you find your connections to Amazon timing out, meaning that you can't even see which book provider has the better price on the book you want; you're stuck looking and buying from B&N.

Now, let's look at something a little more close to home for those of us developing web applications. There have been a number of success stories the last few years: MySpace, Digg, and Flickr all come to mind. Would these endeavors have been as successful had they needed to pay multiple times for bandwidth, once to their ISP and once each to each telco charging for content providers? Indeed, some of these are still free services — how would they ever have been able to pay the extra amounts to the telcos in the first place?

So, basically, the only winners here are the telcos.

Considering how ludicrous this scheme is, one must be thinking, isn't the US Government going to step in and regulate against such behaviour? The answer, sadly, is no. The GOP doesn't like regulation, and so they want market forces to decide. Sadly, what this will likely do is force a number of content providers to offshore their internet operations — which is likely to have some pretty negative effects on the economy.

The decision isn't final — efforts can still be made to prevent it (the above link references a Senate committee meeting; there's been no vote on it). Call your representatives today and give them an earful. Tell them it's not just about regulation of the industry, but about fair competition in the market. Allowing the telcos to extort money from content providers will only reduce the US' economic chances in the world, and stifle innovation and choice.

Continue reading...

Automating PHPUnit2 with SPL

I don't blog much any more. Much of what I work on any more is for my employer, Zend, and I don't feel at liberty to talk about it (and some of it is indeed confidential). However, I can say that I've been programming heavily on PHP5 the past few months, and had a chance to do some pretty fun stuff. Among the new things I've been able to play with are SPL and PHPUnit — and, recently, together.

Continue reading...

PHP error reporting for Perl users

On perlmonks today, a user was needing to maintain a PHP app, and wanted to know what the PHP equivalent of perl -wc was — specifically, they wanted to know how to run a PHP script from the commandline and have it display any warnings (ala perl's strict and warnings pragmas).

Unfortunately, there's not as simple a way to do this in PHP as in perl. Basically, you need to do the following:

  • To display errors:

    • In your php.ini file, set display_errors = On, or
    • In your script, add the line ini_set('display_errors', true);
  • To show notices, warnings, errors, deprecation notices:

    • In your php.ini file, set error_reporting = E_ALL | E_STRICT, or
    • In your script, add the line error_reporting(E_ALL | E_STRICT);

Alternatively, you can create a file with the lines:

error_reporting(E_ALL | E_STRICT);
ini_set('display_errors', true);

and then set the php.ini setting auto_prepend_file to the path to that file.

NOTE: do not do any of the above on a production system! PHP's error messages often reveal a lot about your applications, including file layout and potential vectors of attack. Turn display_errors off on production machines, set your error_reporting somewhat lower, and log_errors to a file so you can keep track of what's going on on your production system.

The second part of the question was how to run a PHP script on the command line. This is incredibly simple: php myscript.php. No different than any other scripting language.

You can get some good information by using some of the switches, though. -l turns the PHP interpreter into a linter, and can let you know if your code is well-formed (which doesn't necessarily preclude runtime or parse errors). -f will run the script through the parser, which can give you even more information. I typically bind these actions to keys in vim so I can check my work as I go.

If you plan on running your code solely on the commandline, add a shebang to the first line of your script: #!/path/to/php. Then make the script executable, and you're good to go. This is handy for cronjobs, or batch processing scripts.

All of this information is readily available in the PHP manual, and the commandline options are always available by passing the --help switch to the PHP executable. So, start testing your scripts already!

Continue reading...

Cgiapp dual releases

Today, I have released two versions of Cgiapp into the wild, Cgiapp 1.8.0 and Cgiapp2 2.0.0rc1.

Cgiapp 1.8.0 is a performance release. I did a complete code audit of the class, and did a number of changes to improve performance and fix some previously erratic behaviours. Additionally, I tested under both PHP4 and PHP5 to make sure that behaviour is the same in both environments.

However, Cgiapp 1.8.0 markes the last feature release of Cgiapp. I am deprecating the branch in favor of Cgiapp2.

Cgiapp2 is a PHP5-only version of Cgiapp. Some of the changes:

  • Cgiapp2 is an abstract class, with the abstract method setup(). Now it is truly non-instantiable!
  • Cgiapp2 makes extensive use of visibility operators. Key methods have been marked final, some methods are now protected, others static. See the changelog for more information.
  • Cgiapp2 is now E_STRICT compliant.
  • Cgiapp2 implements the CGI::Application 4.x series callback hook system. This is basically an observer pattern, allowing developers to register callbacks that execute at different locations in the runtime.
  • Cgiapp2 adds some extensive error and exception handling classes, including observable errors and exceptions.
  • I created a template interface. If implemented, a template engine can be plugged into the architecture at will — at the superclass, application class, and instance script level, allowing developers to mix-and-match template engines or choose whichever matches their taste, without having to rewrite application code. Three template plugins are included:

Cgiapp and Cgiapp2 are available at Sourceforge.

Keep reading for more information on the evolution of Cgiapp2.

Continue reading...

XP + Cygwin + coLinux == Productivity

I wrote earlier of my experiences using Windows XP, a move I've considered somewhat unfortunate but necessary. I've added a couple more tools to my toolbox since that have made the environment even better.

Continue reading...

Using Windows XP

Those of you who know me well know that I've been using Linux as my primary OS for many years now. At this point, for me, using Windows is like deciding I'm going to use a limited vocabulary; I can get the idea across, but not quite as well.

Due to the nature of where I work and the fact that I'm telecommuting, I've been having to maintain a dual-boot system. I use Ubuntu for my daily OS, and boot into Windows when I need to interact with people at work via Webex or Skype (we're using the new Skype beta with video, and it only works on Windows XP at this time).

This week, however, I've had to stay on Windows quite a bit — lots of impromptu conference calls and such. So, I've been customizing my environment, and been pretty pleased with the results.

Continue reading for some tips on customizing your Windows XP environment to work and feel a little more like… linux.

Continue reading...