SELECT is a foundational SQL clause that is mandatory to run queries. SELECT can take the form of a specific reference to a column name found within a data table, a string, integer, asterisk (*), expression, built-in function and user-defined function. When adding more than one value to the clause, a comma must be used to separate each value. While optional, it is a best practice to finish each query statement with a semi-colon
While most are familiar with using SELECT in conjunction with the
FROM clause, you can also use it without referencing a data table at all. Take for example the SQL query below:
SELECT 'Dog' , 1 + 1 , NULL , UPPER('cat');
The query contains strings, integers, expressions and built-in functions and will return a result set of:
As you can see, it returned the values as we had laid them out in the SELECT clause with the two differences appearing where arithmetic is being performed for the
1 + 1 or
2 and cat being transformed to CAT by the built-in function
UPPER(). In addition, a column header is provided in all result set, but has the same value of,
?column? for each value in our SELECT clause. This is the default header value if a name is not provided for each column. Further into this guide I will show you how you can rename the columns to fit your liking.
SELECT & FROM
A more traditional use of SELECT is pairing it with the
FROM clause. The FROM clause selects a data table that represents the population of the data set that can be queried. When a data table is selected, the SELECT can reference columns of data by referencing the column name.
Provided the data table below:
The SELECT statement of:
SELECT id, first_name, last_name FROM name;
Will return a result set of:
Which is exactly the same as the data table located above the query because we have specified each column represented in the data table. Since we are querying all columns within the data table we could also use a shorthand technique of using an asterisk, rather than typing out each of the column names. The asterisk (*) is a reserved character that when used tells the database to select all columns from the data tables included in the query. The following SELECT statement would deliver the same results as the earlier query we wrote.
SELECT * FROM name;
While I won’t cover the topic of joining tables in this guide, it should be known that you can use dot-notation in a query to identify the table name and column name (e.g.
name.id). This should be used when multiple tables are involved in a query to make it easier for people to interpret which data table the column is located in and eliminates errors that will occur in scenarios where the same column taxonomy is used across multiple data tables being queried together. In addition, you can use the asterisk to query all columns from a specific table (e.g.
As I mentioned earlier in the guide, you are able to rename columns in your query statement to fit your reporting needs and the way to do this is with the reserved word
AS or simply by using a space after the value. Both methods signify that there is an alias for the value with the value to the right of it is the user specified alias.
Taking the example before, we could rename the columns in our result set as such:
SELECT id AS name_id, first_name f_name, last_name AS last_n FROM name;
To wrap up this SELECT guide I will talk about
DISTINCT. This reserved keyword deduplicates the values in the columns for your query statement. By default, when you SELECT a column all records will appear unless filtering or limiting is applied to the query. Filtering and limiting is typically conducted through other clauses, but the easiest way to query unique records for a given column or combination of columns is with the DISTINCT keyword. Simply apply DISTINCT prior to selecting columns and the result set will only display the unique records present from the query.
Modifying the data table example previously used, let's say that we now have new records:
If we were to run the query:
SELECT DISTINCT first_name FROM name;
The results set would look like:
As you can see we now have two records, rather than the three that are present in the data table because DISTINCT deduplicated the values from the columns being selected. In this case, two records had the identical value of "steve" in the data table and DISTINCT removed all but one instance of "steve".
Reserved Word Casing
Before I end, I want to let you know that the word casing of the SELECT clause and all reserved keywords do not matter. The query will return the result as expected, but it is best practice to use all UPPERCASE WORDS when using the clause and reserved keywords.