[postmodern-devel] :IN/:NOT-IN on an empty :SET always results in :NULL

Sat Feb 26 02:29:42 UTC 2011

Hi,

I'd like to be able to write a query similar to:

  (let ((set ...))
    (postmodern:sql
      (:select ... :from ...
       :where (:not-in ... (:set set)))))

But SET could be empty.  This results in the query:

  (SELECT ... FROM ... WHERE (... NOT IN (NULL)))

(The same happens when simply (:SET) is given.)

S-SQL compiles an empty set to "(NULL)", whether the empty set is
expressed statically with no arguments, or with a single argument
evaluating to the empty list.  The problem is that "(NULL)" does not
express the empty set.  The PostgreSQL IN and NOT IN operators (at
least for explicit sets on the right-hand side) are defined[1] to
evaluate to NULL whenever:

  * NULL is a member of the literal set on the right-hand side, and
  * The left-hand side value is not a member of the set.

(Or of course the left-hand side is NULL.)

[1] http://www.postgresql.org/docs/8.3/static/functions-comparisons.html#AEN15459

This means that (:IN x (:SET)) and (:IN x (:SET y)), for any value of
x and for y being the empty list, both always result in :NULL.  The
same goes for :NOT-IN.

   (postmodern:query (:select (:in 42 (:set))) :single)
=> :NULL

...where the result should be NIL (is 42 is a member of the empty set?
no).

   (postmodern:query (:select (:not-in 42 (:set))) :single)
== (postmodern:query (:select (:not (:in 42 (:set)))) :single)
=> :NULL

...where the result should be T (is 42 not a member of the empty set?
yes).

This is probably going to be tricky to solve.  The above part of the
PostgreSQL manual doesn't say anything about expressing an empty set
on the RHS of IN/NOT IN.  I tried looking through the grammar[2], but
it's rather unwieldy and hard for me to follow.

[2] http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/parser/gram.y;hb=HEAD

Some people[3] suggest expressing the empty set as e.g.
(SELECT 1 WHERE FALSE):

> select 42 in (select 1 where false);
 ?column? 
----------
 f
(1 row)

[3] http://groups.google.com/group/pgsql.general/msg/736022e9c06affac?hl=en

But the problem with this is that this empty set is typed:

> select 'foo' in (select 1 where false);
ERROR:  invalid input syntax for integer: "foo"

So, the tricky problem is: how does one correctly compile an empty
:SET S-SQL expression to an expression of an empty set, that works in
any context (that is, no matter what the data type of the LHS is)?

Instead, maybe some trickery can be done in the definition for :IN.
It could check to see whether the RHS is an S-SQL expression for the
empty set and, if so, result in "false" (or "true" for :NOT-IN).  This
would be ugly, but closer to correct.  (Caveat: when the LHS is null
at query time, the result has to be NULL.  So, when the RHS of :IN is
the empty set, the expander would have to return an expression which
checks to see whether the compiled LHS expression is null, and if so,
result in "null" instead of "false".)

Any thoughts on better ways?

-- 
J.P. Larocque <jpl at thoughtcrime.us>