What Does Deterministic Mean in SQL?

In SQL, a deterministic expression or function always returns the same result when given the same input. Conversely, a nondeterministic expression or function may return different results even with the same input due to factors like system time, randomization, or external data changes.

Below is a quick overview of the differences between deterministic and nondeterministic functions/expressions.

Deterministic Functions/Expressions

Here are some examples of deterministic functions and expressions. We can expect the same output every time we run these (provided we use the same input).

  • Deterministic functions: Functions that produce the same output for the same input every time.
    • Example: ABS(-10) always returns 10.
    • Example: LEN('SQL') always returns 3.
  • Deterministic expressions: These are expressions that evaluate consistently, such as mathematical operations on constants or columns.
    • Example: 5 + 10 always returns 15.
    • Example: FirstName + ' ' + LastName will always return the same result given the same input.

Nondeterministic Functions/Expressions

Here are examples of nondeterministic functions and expressions. We can expect to get a different output every time we run these with the same input. That is not to say that it will always return a different result, just that it can (and in most cases probably will).

  • Nondeterministic functions: Functions that can produce different results with the same input due to external factors.
    • Example: GETDATE() returns the current date and time.
    • Example: NEWID() generates a random number and can produce different results with each call.
  • Nondeterministic expressions: These rely on external factors (e.g., system state or current time). Even though we may use a constant in the expression, the fact that there’s a nondeterministic aspect makes it a nondeterministic expression.
    • Example: SELECT GETDATE() + 10 returns the current system date and time plus ten days, which will vary across executions.
    • Example: SELECT RAND() * 100 generates a random number and can produce different results with each call.

Code Example: Deterministic vs Nondeterministic Functions

SELECT 
    LEN('Abundance') AS "Deterministic",
    NEWID() AS "Nondeterministic"
UNION ALL
SELECT 
    LEN('Abundance') AS "Deterministic",
    NEWID() AS "Nondeterministic"
UNION ALL
SELECT 
    LEN('Abundance') AS "Deterministic",
    NEWID() AS "Nondeterministic";

Output (Sample):

Deterministic  Nondeterministic                    
-------------  ------------------------------------
9              66A2BB60-CE33-4931-A2B3-53368085BBC2
9              72AC75A5-A401-4F19-8496-C310B03D9282
9              3FB5AF2B-0868-4FB0-8DA9-BCC3AC6F8429

We can see that each time this query is executed, NEWID() returns a different value, while the deterministic function LEN() always returns the same value (for the same input).

Contexts Where Nondeterministic Functions/Expressions Are Not Allowed

Some database operations or contexts require that only deterministic functions or expressions be used. Examples include:

  • Indexing a Computed Column: In many RDBMSs (e.g., SQL Server), computed columns can be indexed only if the expression that defines the column is deterministic.
    • Example of allowed expression for an indexed computed column: ALTER TABLE Orders ADD OrderAmount AS Quantity * UnitPrice;
    • Example of disallowed expression: ALTER TABLE Orders ADD CurrentDate AS GETDATE();
  • Persisted Computed Columns: In SQL Server, computed columns marked as PERSISTED (i.e., stored on disk) must be deterministic because the value is fixed at the time of insertion.
    • Example: ALTER TABLE Employees ADD FullName AS FirstName + ' ' + LastName PERSISTED;
    • If a nondeterministic function like GETDATE() is used, SQL Server will throw an error.
  • Function-based Indexes (Oracle, SQL Server): These are indexes that use a function on a column. To create such an index, the function must be deterministic.

In summary, deterministic functions and expressions are those that consistently produce the same result for the same input, and they are often required in contexts like indexing or persisting computed columns. Nondeterministic functions, on the other hand, vary based on external factors and are disallowed in certain database operations.