Blog Posts
Design Ideas
I had some success last night with the `My::Portal` `CGI::Application` superclass I'm building: I actually got it working with `CGI::Wiki::Simple` (after I debugged the latter to fix some delegation issues!). Now that I know the proof of concept works, I'm ready to start in on some other issues.
The first issue is: how can I specify different directories for different applications to search for templates, while retaining the default directory so that the superclass can build the final page? I could always keep all templates in a single directory and prefix their names, but that seems inelegant, somehow. I'll need to explore how `HTML::Template` integration works with `CGI::App`.
Second, and closely related: how do I want it to look, in the end? I could see keeping the design we have — it's clean, simple, and yet somehow functionally elegant. Okay, I'm exaggerating — it's your standard three-column with header and footer. But it goes with the idea of blocks of content. I need to think about that.
I saw a design idea for a WikiWikiWeb today, though, that totally changed my ideas of how a Wiki should look. I hadn't been to Wikipedia for some time, but a Google link to Gaston Julia showed up on Slashdot as it shut down a site in Australia, and so I visited it. I like the new design — it separates out the common links needed into a nice left menu, and puts a subset of that at the top and bottom of the main column as well, using nice borders to visually separate things. I much prefer it to PhpWiki's default style, as well as to anything else I've really seen so far relating to Wiki layout.
Fun with Find
I've had occasion to need to grab a specific set of files from a large directory — most recently, I needed to grab some specific access logs from our Apache logfiles at work.
Enter `find`.
I needed to get all files newer than a specific date, and with the pattern 'sitename-access_log.timestamp.gz'. I then needed to tar up these files and grab them for processing. So, here's what I did:
- The `-newer filename` flag tells find to locate files newer than `filename`.
- The `-regex` flag tells find to locate files matching the regular expression. The regex that find uses is a little strange, however, and didn't follow many conventions I know; for one thing, it's assumed that the pattern you write will match against the entire string, and not just a portion of it. What I ended up using was `-regex '.*access_log.*gz'`, and that worked.
- The `-printf` flag tells find to format the printing. This is useful when using the output of find in another program. For instance, tar likes a list of filenames… so I used `-printf "%p "`, which separated each filename with a space.
I then backticked my full find statement and used it as the final argument to a tar command; voila! instant tar file with the files I need!
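For comparison, the same selection can be sketched in Perl with the core `File::Find` module. This is not the command I ran, just a rough equivalent under a few assumptions: the helper name `newer_matching` is made up for illustration, "newer" is taken to mean a later modification time than the reference file, and the pattern is anchored at both ends to mimic the way find's `-regex` matches against the whole path.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: list the plain files under $dir whose full path matches
# $pattern (anchored at both ends, like find's -regex) and whose modification
# time is later than that of $reference_file (like find's -newer).
sub newer_matching {
    my ($dir, $reference_file, $pattern) = @_;

    my $ref_mtime = (stat $reference_file)[9];
    die "cannot stat $reference_file: $!\n" unless defined $ref_mtime;

    my @found;
    find(sub {
        return unless -f $_;                               # plain files only
        return unless $File::Find::name =~ /\A$pattern\z/; # whole-path match
        push @found, $File::Find::name if (stat _)[9] > $ref_mtime;
    }, $dir);

    @found = sort @found;
    return @found;
}

# Example call (paths are placeholders):
#   my @files = newer_matching('/path/to/logs', '/path/to/reference', qr/.*access_log.*gz/);
```

The resulting list could then be handed to tar as separate arguments, for instance with `system('tar', 'cf', 'logs.tar', @files)`, which sidesteps any quoting trouble with odd filenames.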
conditional use in perl
I've been struggling with how to use modules at runtime instead of compile time (I even wrote about this once before). I finally figured it out:

    my $module = "ROX::Filer";
    eval "use $module";
    die "couldn't load module $module: $@\n" if $@;

Now I just need to figure out how to create objects from dynamic module names…!
Update: Creating objects from dynamic names is as easy as dynamically loading the module at run-time:

    my $obj = $module->new();

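Putting the two snippets together, here is a minimal sketch of the whole round trip. It uses the core `IO::Handle` module as a stand-in, since `ROX::Filer` may not be installed, and the `load_and_create` helper name is hypothetical. Because the module name is interpolated into a string `eval`, the sketch also assumes it is worth rejecting anything that doesn't look like a package name first.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module by name at runtime and return a new object.
sub load_and_create {
    my ($module, @args) = @_;

    # Refuse anything that is not a plain package name before string-eval'ing it.
    die "invalid module name: $module\n"
        unless $module =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;

    # The trailing "1" makes the eval return true only if the use succeeded.
    eval "use $module; 1" or die "couldn't load module $module: $@\n";

    return $module->new(@args);
}

my $obj = load_and_create('IO::Handle');   # stand-in for ROX::Filer
print ref($obj), "\n";
```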
Where's that module?
One continual pain for me with perl is when I need to try to find the location
of a specific module on my filesystem so that I can examine it myself; I end up
first having to find out what my `@INC` path is, then having to dig through it
until I find the module. Fortunately, I'm not the only one; somebody
posted a solution to this problem on
Perl Monks:
Updated: The original listing presented didn't work! The following one, garnered from a comment to the original PM post, does, and is what I'm now using.

    #!/usr/bin/perl -w
    use strict;
    use File::Spec::Functions qw/catfile/;

    # Keep only the modules that load; warn (minus the @INC dump) about the rest.
    my @loaded = grep {
        eval "require $_";
        !$@ ? 1 : ($@ =~ s/\(\@INC contains: \Q@INC\E\)//, warn("Failed loading $_: $@"), 0);
    } @ARGV;

    # Turn Foo::Bar into the %INC key Foo/Bar.pm, then print each real path.
    my @pm = map catfile(split '::') . (/\.pm\z/ ? '' : '.pm'), @loaded;
    print "@INC{@pm}\n";

    __END__

    =pod

    =head1 NAME

    whichpm - lists real paths of specified modules

    =head1 SYNOPSIS

      editor `whichpm Bar`

    =head1 DESCRIPTION

    Analogous to the UN*X command which.

    =cut

Just place it in your `$PATH` and let 'er rip!
Class::DBI
I was reading a thread on the cgiapp mailing list today from several of the core developers about developing a book on `CGI::Application`. In it, several mentioned that it might/should center around `CGI::App` and a handful of oft-used modules. One of those modules is `Class::DBI`.
I took a gander at `Class::DBI` over at CPAN, and it looks absolutely amazing, and at the same time perhaps too abstract. Basically, you create a number of modules and/or packages: one for each table you'll be using in your application, and one to establish your basic connection. Then, each table package inherits from the connection package and defines a number of properties: the name of the table, the columns you'll be using, and then the relations it has to other tables (`has_a(col_name => 'Package::Name'); has_many(col_name => 'Package::Name'); might_have(col_name => 'Package::Name');`), etc.
Then you use the module/packages you need in your script, and you can then use object-oriented notation to do things like insert rows, update rows, search a table, select rows, etc. And it looks fairly natural.
I like the idea of data abstraction like this. I see a couple issues, however:
- I don't like the idea of one package per table; that gets so abstract as to make development come to a stand-still, especially during initial development. However, once development is sufficiently advanced, I could see doing this, particularly for large projects; it could vastly simplify many regular DBI calls.
- I like using SQL. If I need to debug why something isn't working when I interact with the database, I want to have absolute control over the language. Abstracting the SQL means I don't have that fine-grained control that helps me debug.
So, for now, I'll stick with straight DBI…. but this is an interesting avenue to explore.
Ctrl-S and Ctrl-Q in *nix systems
I just ran into this not long ago, and wish I'd discovered it years ago. Basically, Ctrl-S suspends output to the terminal (it's the XOFF flow-control character), while Ctrl-Q resumes it. This is useful when you're in g/vim or screen and you manage to seemingly lock up your application because you accidentally hit Ctrl-S when reaching for another key combo: hit Ctrl-Q and everything comes back.
use autouse ... or not
Due to my cursory reading in the Perl Cookbook, 2nd Edition, earlier this week, I've been investigating the `use autouse` pragma, to see if it will indeed solve my issue of wanting to use different modules based on the current situation. Unfortunately, I cannot find any documentation on it in `perldoc`.
I remember seeing something about wrapping this stuff into a `BEGIN` block, but that would require knowing certain information immediately, and I might need the code to work through some steps before getting there.
Fortunately, this node just appeared on Perl Monks today, and I got to see other ways of doing it:
- The `if` module lets you do something like `use if $type eq 'x', "Some::Module";` However, `$type` must be known at compile time (i.e., it's based on system info or on `@ARGV`); this probably wouldn't work in a web-based application.
- Use `require` and `import` instead: `if ($type eq 'x') { require Some::Module; Some::Module->import if Some::Module->can("import"); }` If your module doesn't export anything, you can even omit the call to `import`.
- Use an `eval`: `if ($type eq 'x') { eval "use Some::Module"; }` This gets around the `import` problem, but could possibly run into other compile time issues.
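One wrinkle with the `require` approach: when the module name itself lives in a variable, `require` treats the string as a file path, so `Some::Module` has to become `Some/Module.pm` first. Here is a minimal sketch, assuming a `$type` value that only becomes known at runtime and using the core `List::Util` and `Scalar::Util` modules as stand-ins for whatever modules you'd really be choosing between.

```perl
use strict;
use warnings;

my $type   = 'x';   # pretend this was worked out at runtime, e.g. from a request
my $module = $type eq 'x' ? 'List::Util' : 'Scalar::Util';   # stand-in modules

# require with a string wants a path (List/Util.pm), not a package name.
(my $file = "$module.pm") =~ s{::}{/}g;
require $file;
$module->import if $module->can('import');

# Anything imported this late is unknown to the compiler, so look the
# function up by name instead of calling it as a bareword.
my $max = $module->can('max') or die "$module has no max()\n";
print $max->(3, 9, 4), "\n";
```

Since there is no string `eval` involved, a failed load simply dies with the usual "Can't locate ..." message, which you can trap with a block `eval` if needed.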
So, basically, I already had the tools to do the job; just needed to examine the problem more.
More CGI::App research... Try the manual!
So, I'm a bit of an idiot… it's been so long since I looked at `CGI::App`, and yet I felt I had such a grasp on it, that I overlooked the obvious step: look at the manual!
In particular, there's a whole series of methods that are used to tailor `CGI::App` to your particular needs, and these include `cgiapp_init()`, `cgiapp_prerun()`, and `cgiapp_postrun()`.
- `cgiapp_init()` is used to perform application-specific initialization behaviour, and is called immediately before the `setup()` method. It can be used to load settings from elsewhere; if it were defined only in a superclass from which other modules inherited, it would then provide common settings for all modules.
- `cgiapp_prerun()` is called immediately before the selected run-mode. If it were defined only in your superclass, you could perform items such as authorization or even form validation; this would then be standard for all your applications. (You can use the `$self->prerun_mode('mode')` call to override the selected run-mode, for instance, thus allowing you to redirect to a different mode if a user isn't permitted there.)
- `cgiapp_postrun()` is called after the run-mode has returned its output, but before HTTP headers have been generated or anything sent to the web browser. Again, if defined in a superclass, it means that you could then place the run-mode output in a specific place within a larger template, and even call other routines to fill in other parts of the main template. You could even check to see if certain parameters were passed to the page, and change the type of output you send back (XML, PDF, image, etc.), allowing you to have a common query element that changes the output type (e.g., a 'print' parameter that returns a PDF or a stripped down template).
In addition, you could specify in the superclass that you're using `CGI::Simple` for the query object (using the `cgiapp_get_query` method), or you could override the `load_tmpl()` method to use `Template::Toolkit` or some other templating system, etc.
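To make the three hooks concrete, here is a rough sketch of how such a superclass might look. It assumes `CGI::Application` (and the `CGI` module it uses for its default query object) is installed; the package names, the `site_name` setting, the `user` parameter, and the `login` run-mode are all made up for illustration, and the "template" is just a string wrapper rather than a real `HTML::Template` file.

```perl
package My::Portal::Base;          # hypothetical superclass name
use strict;
use warnings;
use parent 'CGI::Application';

# Called from new(), just before setup(): a place for shared settings.
sub cgiapp_init {
    my ($self) = @_;
    $self->param(site_name => 'Example Site');   # hypothetical setting
}

# Called just before the selected run-mode: send anonymous users to 'login'.
sub cgiapp_prerun {
    my ($self, $run_mode) = @_;
    $self->prerun_mode('login')
        unless $run_mode eq 'login' || $self->query->param('user');
}

# Called after the run-mode returns; $output_ref is a reference to its output,
# so the superclass can wrap it in a larger page.
sub cgiapp_postrun {
    my ($self, $output_ref) = @_;
    my $site = $self->param('site_name');
    $$output_ref = "<h1>$site</h1>\n<div id=\"main\">$$output_ref</div>\n";
}

package My::Portal::Hello;         # hypothetical application
use parent -norequire, 'My::Portal::Base';

sub setup {
    my ($self) = @_;
    $self->start_mode('hello');
    $self->run_modes(
        hello => sub { 'Hello from the portal.' },
        login => sub { 'Please log in.' },
    );
}

1;
```

An instance script would then boil down to `My::Portal::Hello->new->run;`, and every application inheriting from the base class would pick up the same login check and page wrapper without any extra code.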
Doesn't look so crazy anymore…
CGI::Application Research
I've been wanting to redevelop my home website for some time using `CGI::Application`. The last time I rewrote it from PHP to perl, I developed something that was basically a subset of the things `CGI::App` does, and those things weren't done nearly as well.
The problem I've been running into has to do with having sidebar content, and
wanting to run basically a variety of applications. I want to have a
WikiWikiWeb, a photo gallery, some mail forms, and an article database/blog;
`CGI::App`-based modules for each of these all exist. But I want them all to
utilize the same sidebar content, as well — and that sidebar content may vary
based on the user.
My interest got sparked by this node on
Perl Monks. The author tells of an acquaintance who goes
by the rule that a `CGI::App` should have 10-12 states at most; more than that, and you need to either break it apart or rethink your design. And all `CGI::App`s inherit from a common superclass, so that they share the same DB connections,
templates, etc.
So, I've been investigating this problem. One node on PM notes that the author's ISP uses `CGI::App` with hundreds of run modes spread across many applications; they created a module for session management and access control that calls `use base 'CGI::Application'`; each application then calls `use base 'Control'`, and they all automatically have that same session management and access control, as well as `CGI::Application`.
Another node mentions the
same thing, but gives a little more detail. That author writes a module per
application, each inheriting from a superclass: `UserManager.pm`, `Survey.pm`, `RSS.pm`, `Search.pm`, etc. You create an API for that superclass, and each `CGI::App` utilizes that API to do its work.
This also seems to be the idea behind CheesePizza, a `CGI::App`-based framework for building applications. (All pizzas start out as cheese pizzas; you simply add ingredients.) The problem with that, though, is that I have to learn another framework on top of `CGI::App`, instead of intuiting my own.
But how do I write the superclass? Going back to the original node that sparked
my interest, I found a later reply that described how you
do this. The big key is that you override the `print` method; this allows you to customize the output, and from here you could call functions that create your sidebar blocks, and output the content of the `CGI::App` you just called in a main content area of your template.
Grist for the mill…
robots.txt
One thing I've wondered about is the syntax of the `robots.txt` file, where it's
placed, and how it's used. I've known that it is used to block spiders from
accessing your site, but that's about it. I've had to look into it recently
because we're offering free memberships at work, and we don't want them indexed
by search engines. I've also wondered how we can exclude certain areas, such as
where we collate our site statistics, from these engines.
As it turns out, it's really dead simple. Simply create a `robots.txt` file in your htmlroot, and the syntax is as follows:

    User-agent: *
    Disallow: /path/
    Disallow: /path/to/file

The `User-agent` line can name specific agents or use the wildcard; there are so many spiders out there, it's probably safest to simply disallow all of them. The `Disallow` line should have only one path or name, but you can have multiple `Disallow` lines, so you can exclude any number of paths or files.