What makes SQL special

Written by Chris Zetter

Published 2025-07-20

I wrote a book that guides you through building a database server that can run SQL.

But why write a book about SQL? And why use SQL as the interface to a database? Over the last 50 years, SQL has become the dominant way to communicate with databases. Here are some of the characteristics that make SQL special and have contributed to its popularity:

Relational databases

In the 1970s, the relational model was introduced. The purpose of the model was to provide an abstraction so that you could query data without knowing how it was physically stored. This was done by representing data as tables made up of rows and having a defined set of operations that can transform this data.

SQL embraced the ideas of the relational model and became the language of relational databases. There are other query languages, but none have had the popularity and longevity of SQL.

$An example table containing orders. The table is made up of an id column containing integers, a customer\_id column containing integers and a shipped column containing booleans.$

The relational model uses ideas from set theory to precisely describe relational databases and operations you can perform on them. The image above shows an ‘orders’ table made up of rows. Using the terminology from the relational model, you would call the ‘orders’ a relation made up of tuples. A tuple is a list of values, just like a row. A relation is a set of tuples with the same structure, just like a table is a set of rows.

These terms aren’t always equivalent. For example, In the relational model, tuples should be unique within a relation. In practice, rows in databases don’t usually enforce this constraint. You’re likely to see these terms in database research and in the source code of database implementations.

DATABASES IN THE WILD

Before SQL

Before SQL databases became popular, navigational databases were often used. In a navigational database you can retrieve data by writing a program that navigates between linked records.

Navigational databases were designed with specific use cases in mind. This made them fast, even on limited hardware. However, for a given structure of linked records, certain queries could be difficult to write or too inefficient to run. You could re-structure how the data is stored, but that would break programs that used the database.

To query information in navigational databases you had to write a program. This program was often written in a general purpose imperative programming language that had methods available to interact with the database.

Here’s the simplified example showing what a program might look like that retrieves records from a navigational database. In this example there is a company made up of many departments. Each department record is linked to the employee records for people that work in that department. This program prints the names of all employees:

WHILE TRUE  
  department = NEXT_RECORD('department')  
  BREAK IF EMPTY(department)  
  WHILE TRUE  
    employee = NEXT_RECORD_WITHIN(department, 'employee')  
    BREAK IF EMPTY(employee)  
    PRINT(employee.name)  
  END  
END

This program has a loop which loads through each department until it receives an empty record. For each department it finds, it loops through employees that are in that department and prints them. Due to the way the data is structured, you have to loop through departments even though you only want information about employees.

This example is based on a sample program for the Integrated Database Management System (IDMS). IDMS is a database that runs on mainframes and can be programmed using COBOL.

CA IDMS Navigational DML Programming Guide
This is a modern guide for writing programs for IDMS.

History of databases
A detailed history of databases with comparisons between models.

Designed for everyone

The authors of SQL (originally called SEQUEL) aimed to make a language that was easier to use by people who weren’t trained programmers. They did this by creating a language around the set of operations in the relational model and avoiding having to manage variables when writing queries.

“… there is also a large class of users who, while they are not computer specialists, would be willing to learn to interact with a computer in a reasonably high-level, non-procedural query language. Examples of such users are accountants, engineers, architects, and urban planners. It is for this class of users that SEQUEL is intended. For this reason, SEQUEL emphasizes simple data structures and operations.”

SEQUEL: A Structured English Query Language

There were other languages based on the relational model. QUEL is a query language used for the INGRES database that was made at a similar time to SQL. Here’s a QUEL query that computes the pay for an employee with a certain name:

RANGE OF E IS EMPLOYEE  
RETRIEVE INTO W  
(PAY = E.RATE*E.HOURS)  
WHERE E.NAME = "Jones"

And an equivalent query in SQL:

SELECT RATE*HOURS AS PAY FROM EMPLOYEE  
WHERE NAME = "Jones"  

Both SQL and QUEL allow you to perform similar operations. In QUEL you need to explicitly define variables to represent rows of the employee table, for example the E variable for the EMPLOYEE table. While programmers are used to dealing with variables like this in code, the creators of SQL thought that it would be a barrier to wider use.

A declarative language

When writing SQL, you describe the data you want using high-level operations that are based on the relational model.

This gives databases the power to choose the details of their implementations. For example, databases get to decide what sorting algorithm to use when ordering data. They also choose how to organize data they save to physical storage.

Databases might make decisions so they can specialize for certain uses. They might also have multiple implementations of the same operation and will choose the one that might be fastest for an individual query.

This is great if you want to build your own database server, since you’ll have a large amount of freedom when choosing how to execute SQL statements.

Widespread implementations

Whatever you want to do with data, there’s probably an SQL database suited to your needs.

There are transactional databases (sometimes called OLTP databases for OnLine Transaction Processing) which are designed to quickly read and write data so are well suited to responding to user actions such as creating an account or paying for a purchase. MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server are examples of transactional databases.

There are also analytical databases (sometimes called OLAP databases for OnLine Analytical Processing). These are optimized to run analytical queries across large amounts of data. Google’s BigQuery, Amazon’s RedShift, Clickhouse are all analytical databases.

Transactional and analytical databases are examples of two specializations. There are also databases that are designed to automatically scale in a cloud environment, or databases that are designed to run on resource-constrained devices.

Despite being built for different purposes, all these databases use SQL. This makes being able to write SQL and understand the concepts behind it a valuable skill.

DATABASES IN THE WILD

ISO/IEC 9075

There’s a standard for SQL called ISO/IEC 9075, Information technology - Database languages - SQL. This standard can help database implementations keep their behaviour consistent with each other, making it easier to switch between different implementations.

The standard describes the syntax and behaviour of SQL. The standard doesn’t specify lower level implementation details such as in what format data should be stored.

Database implementations don’t always follow the whole SQL standard. This might be because they disagree with specific decisions in the standard or have implemented a feature before it became standardized and want to keep backwards compatibility for their users. Implementations may be configurable to more closely follow the standard.

Unstandard behaviour

One example of where some databases don’t follow the SQL standard is the treatment of lowercase and uppercase identifiers. While the standard says that identifiers should be converted to uppercase, some databases do the opposite and convert identifiers to lowercase.

CREATE TABLE orders(id INTEGER);  
SELECT id FROM orders;  
SELECT ID FROM orders;  

The above statement will work regardless of how identifiers are treated as the identifiers in CREATE and SELECT will be converted to upper or lower case the same way.

SELECT "id" FROM orders; -- lowercase databases only  
SELECT "ID" FROM orders; -- uppercase databases only  

Quoted identifiers don’t get case converted. So the first query above will only work with databases that convert identifiers to lowercase, the second query only on ones that convert identifiers to uppercase. If you want to write portable SQL that can run on different database servers, you should avoid mixing quoted and unquoted identifiers.

The SQL Standard – ISO/IEC 9075:2023 (ANSI X3.135)
A history of the SQL standard and links to the parts of the standard.

PostgreSQL SQL Conformance
Many SQL databases have documentation explaining how they do/don’t conform to the standard. Also see documentation from Oracle, MySQL, and Apache Spark SQL.

Lexical Structure - Identifiers and Key Words
PostgreSQL documentation explaining how identifier cases are treated.

This post is adapted from the first chapter of Build a Database Server, a hands-on guide to implementing a working SQL database from scratch. Preview or buy the book to learn more about how databases work.

Sources and further reading

A relational model of data for large shared data banks
The journal article from 1970 that introduced the relational model. It explains the model and how it saves people from having to know how data is stored.

Relational Algebra
Relational algebra is a concise way of expressing operations in the relational model. This page explains the common relational operators and shows their SQL equivalent.

SEQUEL: A structured English query language
The article that introduced SEQUEL from 1974 and explained its design decisions. The name was later changed to SQL.

The Design and Implementation of INGRES
Article from 1976 that describes the INGRES database and the QUEL query language. Other databases supported QUEL too, notably earlier versions of PostgreSQL (originally POSTGRES). POSTGRES was so named as it was designed to be a successor of INGRES.

Database of Databases
A filterable collection of database implementations. The collection includes both SQL and non-SQL based databases.

What Goes Around Comes Around… And Around…
A discussion of different categories of databases and architectures. The article also gives examples of features that relational SQL databases have adopted over time.

What makes SQL special

Relational databases

Before SQL

More

Designed for everyone

A declarative language

Widespread implementations

ISO/IEC 9075

Unstandard behaviour

More

More

Sources and further reading

Start your database adventure today

About databases in the wild

Settings

Theme: