Blog Posts

Cgiapp 1.5 released

Cgiapp 1.5 has been released; you may now download it.

This release fixes a subtle bug I hadn't encountered before; namely, when a method name or function name is passed as an argument to mode_param(), run() was receiving the requested run mode… and then attempting to process that as the mode param. The behaviour is now fixed, and is actually simpler than the previous (non-working) behaviour.

Also, on reading Chris Shiflet's paper on PHP security, I decided to reinstate the query() method. I had been using $_REQUEST to check for a run mode parameter; because this combines the $_GET, $_POST, and $_COOKIE arrays, it's considered a bit of a security risk. query() now creates a combined array of $_GET and $_POST variable ($_POST taking precedence over $_GET) and stores them in the property $_CGIAPP_REQUEST; it returns a reference to that property. run() uses that property to determine the run mode now.

Enjoy!

Continue reading...

When array_key_exists just doesn't work

I've been playing with parameter testing in my various Cgiapp classes, and one test that seemed pretty slick was the following:

if (!array_key_exists('some_string', $_REQUEST)) {
    // some error
}

Seems pretty straight-forward: $_REQUEST is an associative array, and I want to test for the existence of a key in it. Sure, I could use isset(), but it seemed… ugly, and verbose, and a waste of keystrokes, particularly when I'm using the param() method:

if (!isset($_REQUEST[$this->param('some_param')])) {
    // some error
}

However, I ran into a pitfall: when it comes to array_key_exists(), $_REQUEST isn't exactly an array. I think what's going on is that $_REQUEST is actually a superset of several other arrays — $_POST, $_GET, and $_COOKIE — and isset() has some logic to descend amongst the various keys, while array_key_exists() can only work on a single level.

Whatever the explanation, I ended up reverting a bunch of code. :-(

Continue reading...

MySQL Miscellanae

Inspired by a Slashdot book review of High Performance MySQL.

I've often suspected that I'm not a SQL guru… little things like being self taught and having virtually no resources for learning it. This has been confirmed to a large degree at work, where our DBA has taught me many tricks about databases: indexing, when to use DISTINCT, how and when to do JOINs, and the magic of TEMPORARY TABLEs. I now feel fairly competent, though far from being an expert — I certainly don't know much about how to tune a server for MySQL, or tuning MySQL for performance.

Last year around this time, we needed to replace our MySQL server, and I got handed the job of getting the data from the old one onto the new. At the time, I looked into replication, and from there discovered about binary copies of a data store. I started using this as a way to backup data, instead of periodic mysqldumps.

One thing I've often wondered since: would replication be a good way to do backups? It seems like it would, but I haven't investigated. One post on the aforementioned Slashdot article addressed this, with the following summary:

  1. Set up replication
  2. Do a locked table backup on the slave

Concise and to the point. I only wish I had a spare server on which to implement it!

Continue reading...

PHP_SELF versus SCRIPT_NAME

I've standardized my PHP programming to use the environment variable SCRIPT_NAME when I want my script to refer to itself in links and form actions. I've known that PHP_SELF has the same information, but I was more familiar with the name SCRIPT_NAME from using it in perl, and liked the feel of it more as it seems to describe the resource better (PHP_SELF could stand for the path to the PHP executable if I were to go by the name only).

However, I just noticed a post on the php.general newsgroup where somebody asked what the difference was between them. Semantically, there isn't any; they should contain the same information. However, historically and technically speaking, there is. SCRIPT_NAME is defined in the CGI 1.1 specification, and is thus a standard. However, not all web servers actually implement it, and thus it isn't necessarily portable. PHP_SELF, on the other hand, is implemented directly by PHP, and as long as you're programming in PHP, will always be present.

Guess I have some grep and sed in my future as I change a bunch of scripts…

Continue reading...

PHP: Continue processing after script aborts

Occasionally, I've needed to process a lot of information from a script, but I don't want to worry about PHP timing out or the user aborting the script (by clicking on another link or closing the window). Initially, I investigated register_shutdown_function() for this; it will fire off a process once the page finishes loading. Unfortunately, the process is still a part of the current connection, so it can be aborted in the same way as any other script (i.e., by hitting stop, closing the browser, going to a new link, etc.).

However, there's another setting initialized via a function that can override this behaviour — i.e., let the script continue running after the abort. This is ignore_user_abort(). By setting this to true, your script will continue running after the fact.

This sort of thing would be especially good for bulk uploads where the upload needs to be processed — say, for instance, a group of images or email addresses.

Continue reading...

Practical PHP Programming

In the past two days, I've seen two references to Practical PHP Programming, an online book that serves both as an introduction to programming with PHP5 and MySQL as well as a good advanced reference with many good tips.

This evening, I was browsing through the Performance chapter (chapter 18), and found a number of cool things, both for PHP and MySQL. Many were common sense things that I've been doing for a while, but which I've also seen and shaken my head at in code I've seen from others (calculating loop invariables at every iteration, not using variables passed to a function, not returning a value from a function, not using a return value from a function). Others were new and gave me pause for thought (string concatenation with the '.' operator is expensive, especially when done more than once in an operation; echo can take a comma separated list).

Some PHP myths were also dispelled, some of which I've been wondering about for awhile. For instance, the amount of comments and whitespace in PHP are not a factor in performance (and PHP caching systems will often strip them out anyways); double quotes are not more expensive than single quotes unless variable interpolation occurs.

It also has some good advice for SQL optimization, and, more importantly, MySQL server optimization. For instance, the author suggests running OPTIMIZE TABLE table; on any table that has been added/updated/deleted from to any large extent since creation; this will defrag the table and give it better performance. Use CHAR() versus VARCHAR(); VARCHAR() saves on space, but MySQL has to calculate how much space was used each time it queries in order to determine where the next field or record starts. However, if you have any variable length fields, you may as well use as many as you need — or split off variable length fields (such as a TEXT() field) into a different table in order to speed up searching. When performing JOINs, compare on numeric fields instead of character fields, and always JOIN on rows that are indexed.

I haven't read the entire book, but glancing through the TOC, there are some potential downfalls to its content:

  • It doesn't cover PhpDoc It doesn't appear to cover unit testing Limited
  • coverage of templating solutions (though they are mentioned) Limited usage of
  • PEAR. The author does mention PEAR a number of times, and often indicates that use of certain PEAR modules is preferable to using the corresponding low-level PHP calls (e.g., Mail and Mail_MIME, DB), but in the examples rarely uses them.
  • PHP-HTML-PHP… The examples I browsed all created self-contained scripts that did all HTML output. While I can appreciate this to a degree, I'd still like to see a book that shows OOP development in PHP and which creates re-usable web components in doing so. For instance, instead of creating a message board script, create a message board class that can be called from anywhere with metadata specifying the database and templates to use.

All told, there's plenty of meat in this book — I wish it were in dead tree format already so I could browse through it at my leisure, instead of in front of the computer.

Continue reading...

Get Firefox!

Those who know me know that I love linux and open source. One particular program that firmly committed me to open source software is the Mozilla project — a project that took the Netscape browser's codebase and ran with it to places I know I never anticipated when I first heard of the project.

What do I like about Mozilla? Well, for starters, and most importantly, tabbed browsing changed the way I work. What is tabbed browsing? It's the ability to have multiple tabs in a browser window, allowing you to switch between web pages without needing to switch windows.

Mozilla came out with a standalone browser a number of months back called, first Phoenix, then Firebird, and now Firefox. This standalone browser has a conservative number of basic features, which allow for a lean download — and yet, these basic features, which include tabbed browsing and disabling popups, far surpass Internet Explorer's features. And there are many extensions that you can download and integrate into the browser.

One such extension is a tabbed browsing extension that makes tabbed browsing even more useful. With it, I can choose to have any links leaving a site go to a new tab; or have bookmarks automatically load in a new tab; or group tabs and save them as bookmark folders; or drag a tab to a different location in the tabs (allowing easy grouping).

Frankly, there's few things I can find that Firefox can't do.

And, on top of that, it's not integrated into the operating system. So, if you're on Windows, that means if you use Firefox, you're less likely to end up with spyware and adware — which often is downloaded and installed by special IE components just by visiting sites — ruining your internet experience.

So, spread the word: Firefox is a speedy, featureful, SECURE alternative to Internet Explorer!

Continue reading...

Cgiapp Roadmap

I've had a few people contact me indicating interest in Cgiapp, and I've noticed a number of subscribers to the freshmeat project I've setup. In addition, we're using the library extensively at the National Gardening Association in developing our new site (the current site is using a mixture of ASP and Tango, with several newer applications using PHP). I've also been monitoring the CGI::Application mailing list. As a result of all this activity, I've decided I need to develop a roadmap for Cgiapp.

Currently, planned changes include:

  • Version 1.x series:

    • Adding a Smarty registration for stripslashes (the Smarty "function" call will be sslashes).
    • param() bugfix: currently, calling param() with no arguments simply gives you a list of parameters registered with the method, but not their values; this will be fixed.
    • error_mode() method. The CGI::Application ML brought up and implemented the idea of an error_mode() method to register an error_mode with the object (similar to run_modes()). While non-essential, it would offer a standard, built-in hook for error handling.
    • $PATH_INFO traversing. Again, on the CGI::App ML, a request was brought up for built-in support for using $PATH_INFO to determine the run mode. Basically, you would pass a parameter indicating which location in the $PATH_INFO string holds the run mode.
    • DocBook tutorials. I feel that too much information is given in the class-level documentation, and that usage tutorials need to be written. Since I'm documenting with PhpDoc and targetting PEAR, moving tutorials into DocBook is a logical step.
  • Version 2.x series:

    Yes, a Cgiapp2 is in the future. There are a few changes that are either necessitating (a) PHP5, or (b) API changes. In keeping with PEAR guidelines, I'll rename the module Cgiapp2 so as not to break applications designed for Cgiapp.

    Changes expected include:

    • Inherit from PEAR. This will allow for some built in error handling, among other things. I suspect that this will tie in with the error_mode(), and may also deprecate croak() and carp().

    • Changes to tmpl_path() and load_tmpl(). In the perl version, you would instantiate a template using load_tmpl(), assign your variables to it, and then do your fetch() on it. So, this:

      $this->tmpl_assign('var1', 'val1');
      $body = $this->load_tmpl('template.html');
      

      Becomes this:

      $tmpl = $this->load_tmpl();
      $tmpl->assign('var1', 'val1');
      $body = $tmpl->fetch('template.html');
      

      OR

      $tmpl = $this->load_tmpl('template.html');
      $tmpl->assign('var1', 'val1');
      $body = $tmpl->fetch();
      

      (Both examples assume use of Smarty.) I want to revert to this behaviour for several reasons:

      • Portability with perl. This is one area in which the PHP and perl versions differ greatly; going to the perl way makes porting classes between the two languages simpler.

      • Decoupling. The current set of template methods create an object as a parameter of the application object — which is fine, unless the template object instantiator returns an object of a different kind.

        Cons:

        • Smarty can use the same object to fill multiple templates, and the current methods make use of this. By assigning the template object locally to each method, this could be lost. HOWEVER… an easy work-around would be for load_tmpl() to create the object and store it an a parameter; subsequent calls would return the same object reference. The difficulty then would be if load_tmpl() assumed a template name would be passed. However, even in CGI::App, you decide on a template engine and design for that engine; there is never an assumption that template engines should be swappable.

        • Existing Cgiapp1 applications would need to be rewritten.

    • Plugin Architecture: The CGI::App ML has produced a ::Plugin namespace that utilizes a common plugin architecture. The way it is done in perl is through some magic of namespaces and export routines… both of which are, notably, missing from PHP.

      However, I think I may know a workaround for this, if I use PHP5: the magic __call() overloader method.

      My idea is to have plugin classes register methods that should be accessible by a Cgiapp-based class a special key in the $_GLOBALS array. Then, the __call() method would check the key for registered methods; if one is found matching a method requested, that method is called (using call_user_func()), with the Cgiapp-based object reference as the first reference. Voilá! instant plugins!

      Why do this? A library of 'standard' plugins could then be created, such as:

      • A form validation plugin
      • Alternate template engines as plugins (instead of overriding the tmpl_* methods)
      • An authorization plugin

      Since the 'exported' methods would have access to the Cgiapp object, they could even register objects or parameters with it.

If you have any requests or comments on the roadmap, please feel free to contact me.

Continue reading...

New site is up!

The new weierophinney.net/matthew/ site is now up and running!

The site has been many months in planning, and about a month or so in actual coding. I have written the site in, instead of flatfiles, PHP, so as to:

  • Allow easier updating (it includes its own content management system)
  • Include a blog for my web development and IT interests
  • Allow site searching (everything is an article or download)

I've written it using a strict MVC model, which means that I have libraries for accessing and manipulating the database; all displays are template driven (meaning I can create them with plain-old HTML); and I can create customizable applications out of various controller libraries. I've called this concoction Dragonfly.

There will be more developments coming — sitewide search comes to mind, as well as RSS feeds for the blog and downloads.

Stay Tuned!

Continue reading...

What's keeping that device in use?

Ever wonder what's keeping that device in use so you can't unmount it? It's happened to me a few times. The tool to discover this information? lsof.

Basically, you type something like: lsof /mnt/cdrom and it gives you a ps-style output detailing the PID and process of the processes that are using the cdrom. You can then go and manually stop those programs, or kill them yourself.

Continue reading...