GROUP BY Clause

GROUP BY is the first non-relational layer clause that executes. This means that it occurs after the initial result set has been generated during the query. GROUP BY behaves identically to DISTINCT in that it deduplicates the records based on the column groupings present in the query at the time of execution, but where it is different is that it allows aggregation functions and the HAVING clause to be able to be added to the query. The values that can be provided to GROUP BY are comma-separated column names or the index value of the column name. When an aggregation is included in the SELECT clause, each non-aggregated column must be included as a value in the clause either by column name or index position reference.

Given the following table:

id	first_name	last_name	department_id
1	Nanni	Saphin	4
2	Bea	McTurlough	2
3	Erryn	Chable	3
4	Flynn	Roman	1
5	Mart	Faragher	1

Querying data with GROUP BY referencing the column name and index position:

SELECT department_id
FROM employee
GROUP BY department_id

SELECT department_id
FROM employee
GROUP BY 1;

Will return the following result set:

department_id
1
2
3
4

As you can see, both the name and index position reference will return the same result set.

Querying data with DISTINCT and GROUP BY:

SELECT DISTINCT department_id
FROM employee;

SELECT department_id
FROM employee
GROUP BY 1;

Will return the following result set:

department_id
1
2
3
4

This is the identical result set as the result set displayed in the previous example and you can see that there is no difference in the results between using DISTINCT and GROUP BY in this query.

Aggregation function with GROUP BY:

SELECT department_id,
COUNT(department_id)
FROM employee
GROUP BY 1;

Will return the following result set:

department_id	count
1	2
3	1
2	1
4	1

With this query a count of the records representing each department is aggregated and then grouped by the department_id

Data

GROUP BY Clause

03/17/2022 • Reading Time: 1 Minutes

GROUP BY is the first non-relational layer SQL query clause. It creates unique data groupings and allows for aggregations to be used.

Given the following table:

📬 Subscribe to My Newsletter