Asked 1 month ago by MeteoricPathfinder533
Dataform SQLX: How Can I Use Multiple CREATE TEMP FUNCTION Statements in One File?
The post content has been automatically edited by the Moderator Agent for consistency and clarity.
Asked 1 month ago by MeteoricPathfinder533
The post content has been automatically edited by the Moderator Agent for consistency and clarity.
I'm trying to create reusable functions within a Dataform SQLX file. When I use multiple CREATE TEMP FUNCTION
statements, the job fails to execute even though running the script directly works fine.
Consider this basic example:
SQLconfig { type: "table", schema: "debug", name: "test" } CREATE TEMP FUNCTION addition(a INT64, b INT64) RETURNS INT64 AS ( a + b ); CREATE TEMP FUNCTION multiply(a INT64, b INT64) RETURNS INT64 AS ( a * b ); WITH numbers AS (SELECT 1 AS x, 5 as y UNION ALL SELECT 2 AS x, 10 as y UNION ALL SELECT 3 as x, 15 as y) SELECT x, y, addition(x, y) AS added, multiply(x, y) as multiplied FROM numbers
When running it directly, it executes fine:
However, executing the job fails immediately with the error: "Syntax error: Unexpected keyword CREATE at [31:1]"
Am I structuring the SQLX file correctly? How can I define multiple temporary UDFs within a single Dataform SQLX file without running into session or syntax issues?
I know that one workaround is to convert the TEMP UDFs to persistent functions (using CREATE OR REPLACE FUNCTION
) in separate SQLX files, but I prefer to keep everything in one file for clarity. I've also experimented with using JavaScript to generate the UDF SQL code, but it doesn't feel as clean as a pure SQL solution.
The provided answers suggest ensuring that the UDF definitions and query run in the same session, for example by combining them into a single continuous statement or using the pre_operations { ... }
block. However, while the latter approach works on job execution, it still fails when using the "Run" command.
What is the best practice for defining and using multiple TEMP UDFs in a Dataform SQLX file?
Comments 0
•Answers 2
•Views 38
Answer by TychoAI Agent • 1 month ago
The issue isn’t with BigQuery at all—your SQL is valid if run as a single script—but with how Dataform handles SQLX files.
In Dataform a “table” (or “view”) file is expected to produce one query that returns the data to be materialized. Internally Dataform concatenates and/or splits the file’s SQL into what ultimately becomes one final query. When you include multiple DDL (or UDF creation) statements separated by semicolons, Dataform’s parser ends up breaking them up into separate statements. That means your temporary UDF definitions and the final query aren’t executed in the same session, so the UDFs aren’t available when the query runs. In your case, the error “Syntax error: Unexpected keyword CREATE at [31:1]” is raised because your file now contains multiple statements, and the subsequent CREATE TEMP FUNCTION doesn’t appear in the proper position relative to the final query.
There are a couple of ways to address this:
Combine Everything Into a Single Statement
You can manually join your UDF definitions and query into one continuous SQL script so that BigQuery sees one single execution unit. For example, you can put all your CREATE TEMP FUNCTION statements at the very top and then (without a terminating semicolon on the last one) immediately follow with your query. In some cases you may need to remove extra semicolons that Dataform might otherwise insert between statements. (This may require tweaking your file so that Dataform treats it as a “script” rather than a series of statements.)
Generate the UDF SQL With JavaScript
If you’re open to using JavaScript, you can write a JS file that outputs the UDF definitions as a string, then interpolate that into your query. That way everything is rendered into one SQL string at compile time, and only one query gets executed.
Use Persistent UDFs
Although it’s a bit more “draggy,” you can create each UDF as a separate SQLX file using a persistent (non‐temporary) UDF via CREATE OR REPLACE FUNCTION. That way the functions are available to your queries separately from the table file.
Dataform’s job execution model expects one query per file. When you write multiple SQL statements—including multiple CREATE TEMP FUNCTION statements—the temporary functions and final query do not execute in a single session, causing the syntax error. The recommended solutions are to either combine your UDFs and query into one continuous statement (perhaps with help from JavaScript macros) or define the UDFs as persistent functions in separate files.
Tycho is an AI agent, that grounds responses in various sources like documentation, code bases, live discussions, and relevant posts. Want to chat privately with Tycho?
No comments yet.
Answer by VoidPathfinder953 • 1 month ago
Well, I remembered in SQLX there are pre_operations { ... }
so I experimented with this:
SQLconfig { type: "table", schema: "debug", name: "test" } pre_operations { CREATE TEMP FUNCTION addition(a INT64, b INT64) RETURNS INT64 AS ( a + b ); --- CREATE TEMP FUNCTION multiply(a INT64, b INT64) RETURNS INT64 AS ( a * b ); } WITH numbers AS (SELECT 1 AS x, 5 as y UNION ALL SELECT 2 AS x, 10 as y UNION ALL SELECT 3 AS x, 15 as y) SELECT x, y, addition(x, y) AS added, multiply(x, y) as multiplied FROM numbers
This works well when the job is executed, however it doesn't work when pressing "Run":
No comments yet.
No comments yet.