Featured

Basics of Querying

 




 Understanding the Foundations

Standard SQL is based on the relational model, which is a mathematical model for data management and manipulation. The relational model was initially created and proposed by Edgar F. Codd in 1969. Since then, it has been explained and developed by Chris Date, Hugh Darwen, and others.

the true source for the model’s name is the mathematical concept relation. A relation in the relational model is what SQL calls a table. The two are not synonymous. SQL attempts to represent with a table: a relation has a heading and a body. The heading is a set of attributes (what SQL attempts to represent with columns), each of a given type. An attribute is identified by name and type name. The body is a set of tuples (what SQL attempts to represent with rows). Each tuple’s heading is the heading of the relation. Each value of each tuple’s attribute is of its respective type. 

Some of the most important principals to understand about SQL stem from the relational model’s core foundations—set theory and predicate logic.

The other branch of mathematics that the relational model is based on is called predicate logic. A predicate is an expression that when attributed to some object, makes a proposition either true or false. For example, “salary greater than $50,000” is a predicate. You can evaluate this predicate for a specific employee, in which case you have a proposition.

The relational model uses predicates as one of its core elements. You can enforce data integrity by using predicates. You can filter data by using predicates. You can even use predicates to define the data model itself. You first identify propositions that need to be stored in the database. Here’s an example proposition: an order with order ID 10256 was placed on July 07, 2023 by the customer with ID 45, and handled by the employee with ID 9. You then create predicates from the propositions by removing the data and keeping the heading. Remember, the heading is a set of attributes, each identified by name and type name. In this example, you have orderid INT, orderdate DATE, custid INT, and empid INT


For people with a procedural programming background, the natural way to interact with data (in a file, record set, or data reader) is with iterations. So using cursors and other iterative constructs in SQL is, in a way, an extension to what they already know. However, the correct way from the relational model’s perspective is not to interact with the rows one at a time; rather, use relational operations and return a relational result. This, in SQL, translates to writing queries

 Understanding Logical Query Processing

SQL has both logical and physical sides to it. The logical side is the conceptual interpretation of the query that explains what the correct result of the query is. The physical side is the processing of the query by the database engine.

SQL, is a declarative English-like language. In this language, declarative means you define what you want, as opposed to imperative languages that define also how to achieve what you want.


Logical Query processing phases

The main statement used to retrieve data in SQL is the SELECT statement. Following are the main query clauses specified in the order that you are supposed to type them (known as “keyed-in order”): 

1. SELECT 
2. FROM 
3. WHERE 
4. GROUP BY
5. HAVING 
6. ORDER BY 

But as mentioned, the logical query processing order, which is the conceptual interpretation order, is different. It starts with the FROM clause. Here is the logical query processing order of the six main query clauses:

 1. FROM 
2. WHERE 
3. GROUP BY 
4. HAVING 
5. SELECT 
6. ORDER BY

Each phase operates on one or more tables as inputs and returns a virtual table as output. The output table of one phase is considered the input to the next phase. 

Evaluate the FROM Clause


In the first phase, the FROM clause is evaluated. That’s where you indicate the tables you want to query and table operators like joins if applicable.

Filter Rows Based on the WHERE Clause 


The second phase filters rows based on the predicate in the WHERE clause. Only rows for which the predicate evaluates to true are returned
 

Group Rows Based on the GROUP BY Clause


 This phase defines a group for each distinct combination of values in the grouped elements from the input table. It then associates each input row to its respective group.

Filter Rows Based on the HAVING Clause 


This phase is also responsible for filtering data based on a predicate, but it is evaluated after the data has been grouped; hence, it is evaluated per group and filters groups as a whole. As is usual in SQL, the filtering predicate can evaluate to true, false, or unknown

Process the SELECT Clause 


The fifth phase is the one responsible for processing the SELECT clause. What’s interesting about it is the point in logical query processing where it gets evaluated—almost last. That’s interesting considering the fact that the SELECT clause appears first in the query.

Handle Presentation Ordering 


The sixth phase is applicable if the query has an ORDER BY clause. This phase is responsible for returning the result in a specific presentation order according to the expressions that appear in the ORDER BY list. 













Comments