SQL Nested Queries or Sub Queries with Doctrine DBAL

I recently ran into a problem with my website for which the solution was a nested query (sometimes termed a subquery). However, I use Doctrine DBAL for creating my dynamic queries, and there's no documentation on how to do them.

The problem

For my art gallery, the raw SQL looks something like the following:

SELECT
    p.filename,
    p.description,
    p.created,
    array_agg(DISTINCT t.tag) as tags
FROM
    photos p
LEFT JOIN tags t ON p.filename = t.content_id
WHERE
    p.filename = :filename
    AND t.content_type = :content_type
GROUP BY
    p.filename

I started noticing an odd issue where, post insertion of an image into my gallery, I'd get no error, but the form would not appear to have processed. On further inspection, using Z-Ray from ZendHQ, I realized that the insertion was successful, but that the redirect to view the inserted image was returning a 404. I grabbed the executed SQL from Z-Ray for retrieving the image, and it returned no rows.

I started thinking about why this image wasn't posting, when others were, and realized there was one trivial difference: I wasn't including any hashtags in my description, which meant no tags.

Hopefully you can see where this is leading.

A LEFT JOIN normally will not cause the entire query to fail if it finds no matching rows on the joined table; that's the behavior of INNER JOIN. However, if you put a condition that is based on a joined table outside the join itself, it essentially acts like an INNER JOIN, as this is now a condition of the SELECT query.

Sure enough, when I removed the LEFT JOIN and the array_agg column, I got a hit.

"Obvious" solution: move the condition

As a commenter on this post noted, the immediate solution is to move the AND t.content_type = :content_type clause to the JOIN:

SELECT
    p.filename,
    p.description,
    p.created,
    array_agg(DISTINCT t.tag) as tags
FROM
    photos p
LEFT JOIN tags t ON p.filename = t.content_id AND t.content_type = :content_type
WHERE
    p.filename = :filename
GROUP BY
    p.filename

This does work, and requires no huge changes to the DBAL query builder; I just move the condition into the joinLeft(), and carry on.

Preferred solution: nested query

The solution I chose was to do a nested query, and to aggregate those results as an array. This makes it more clear when reading the query as to the intent: I want to select all distinct tags for this image and assign them as an array to a column. I'm using PostgreSQL, so the query looks like this:

SELECT
    p.filename,
    p.description,
    p.created,
    (SELECT ARRAY(
        SELECT DISTINCT tag FROM tags WHERE p.filename = content_id AND content_type = :content_type
        )) as tags
FROM
    photos p
WHERE
    p.filename = :filename
GROUP BY
    p.filename

With this approach, if no rows are returned from the tags table, an empty array is created; otherwise an array of the tag values that match is returned.

However, I didn't know how to create this query using the Doctrine DBAL query builder.

DBAL solution

When creating a SELECT using the DBAL query builder, you do something like this:

$select = $dbal->createQueryBuilder();
$select
    ->select(
        // one argument per column to select
    )
    ->from('table_name', 't') // alias the table

The arguments to select() are expected to be strings, and any given string can be an arbitrary SQL expression.

Creating the subselect is easy; you do it like any other query:

$tags = $dbal->createQueryBuilder();
$tags
    ->select('tag')
    ->distinct()
    ->from('tags')
    ->where('content_id = p.filename')
    ->andWhere('content_type = :content_type');

Now, how do I get that into a column string for a select?

The getSQL() method of a query builder will spit out the SQL sent. Moreover, it does not replace placeholders, so even if you set a bound parameter, it won't be injected into the generated SQL.

Knowing all this, I did the following:

$tags = $dbal->createQueryBuilder();
$tags
    ->select('tag')
    ->distinct()
    ->from('tags')
    ->where('content_id = p.filename')
    ->andWhere('content_type = :content_type');

$select = $dbal->createQueryBuilder();
$select
    ->select(
        'p.filename',
        'p.description',
        'p.created',
        sprintf('(SELECT ARRAY(%s)) as tags', $tags->getSQL()),
    )
    ->from('photos', 'p')
    ->where('p.filename = :filename')
    ->groupBy('p.filename')
    ->setParameter('filename', $filename, ParameterType::STRING)
    ->setParameter('content_type', 'photo', ParameterType::STRING);

(Where ParameterType is imported from the namespace Doctrine\DBAL.)

This approach worked immediately, and generated exactly the same result as the raw SQL I had tested.

Changelog
  • 2025-03-07: clarified that the solution was targeting the DBAL query builder. DBAL can consume raw SQL as well, and does not require usage of the query builder.
  • 2025-03-10: noted that a LEFT JOIN will still work, as long as the t.content_type = :content_type condition is moved from the SELECT to the LEFT JOIN.