What Is a CTE?

CTE is defined as a Common Table Expression. You create a CTE using a WITH query, then reference it within a SELECT, INSERT, UPDATE, or DELETE statement.

It goes by a few names:

  • Common Table Expression
  • Subquery Factoring
  • SQL WITH Clause

Letโ€™s say you have a table called schools with the columns school_idschool_namedistrict_id, and the number of students. You need to write a query to display a list of schools alongside their district ID and the average number of students per school in that district.

Your logic might be as follows:

  1. Create a table with the list of districts and the corresponding average number of students per school.
  2. Join this table with the list of schools and display the required information.
  3. Drop the table with the average number of students per school for each district.

If you use a CTE, you donโ€™t need to create and drop a table.ย You can simply reference the temporary result set created by the WITH query, like you see below:

So, here you start by defining the temporary result setย avg_studentsย in the WITH query. In the brackets, there is a SELECT statement that defines this result set; it contains a list of districts and the corresponding average number of students per school. After the closing parenthesis, you start the main SELECT statement. Notice that you reference the temporary result set just like an ordinary table, using its assigned name (avg_students). The output will include the school name, the ID of its district, and the average number of students in this district.

CTEs in Action

Letโ€™s start by exploring the data. Suppose you are a data analyst for a retail bank and you want to analyze the bonuses given to employees last month. The following table is the starting point:

Bonus_jan

Now letโ€™s say you want to see the bonus paid to each employee along with the average bonus for their position. To this end, you first need to calculate the average bonus for every position. This can be done in a temporary result set (a CTE). The whole query will look like this:

As you can see, we start by defining the temporary result setย avg_position. Then thereโ€™s the main SELECT statement, where you join theย bonus_janย table with the temporary result setย avg_positionย to display information about each employee, their bonus, and the average bonus for that position:

Why Use a CTE?

Now, you may have noticed that in most cases, you could use one or more subqueries instead of CTEs. So why use a CTE?

  • Common Table Expressions better organize long queries. Multiple subqueries often look messy.
  • CTEs also make a query more readable, as you have a name for each of the Common Table Expressions used in a query.
  • CTEs organize the query so that it better reflects human logic. With CTEs, you start by defining the temporary result set(s) and then refer to it/them in the main query. With subqueries, you start with the main query and then place the subqueries in the middle of the query.
  • Finally, there is also a specific category of CTEs called recursive CTEs that are allowed to reference themselves. These CTEs can solve problems that cannot be addressed with other queries. Recursive queries are especially useful in working with hierarchical data.

Leave a Comment