The Language of Databases
SQL stands for Structured Query Language. It's a programming language specifically designed to manage, query, and manipulate data stored in relational databases.
Think of a database as a highly organized collection of spreadsheets, and SQL as the language you use to ask questions about them, add new data, or change existing data.
π― Why Learn SQL?
SQL is one of the most in-demand skills in tech. It's used by data analysts, developers, product managers, marketers, and virtually anyone who works with data.
Data Analysis
Extract insights from millions of rows in seconds
Backend Dev
Power applications with persistent data storage
Machine Learning
Prepare and clean datasets for models
Business Intel
Generate reports and dashboards from live data
How SQL Works
You write a query β a statement telling the database what you want. The database processes it and returns results.
-- Your very first SQL query!
SELECT 'Hello, SQL World!' AS greeting;
Result: This returns a single row with a column called greeting containing the text "Hello, SQL World!"
Popular SQL Databases
π PostgreSQL
Open source, feature-rich, enterprise-grade
π¬ MySQL
Most popular open-source DB, powers the web
πͺΆ SQLite
Lightweight, file-based, great for learning
π’ SQL Server
Microsoft's enterprise database solution
Good news: Core SQL syntax is the same across all databases. Learn it once, use it everywhere. Minor syntax differences exist, but the fundamentals are universal.
π§ Quick Check
What does SQL stand for?
The Structure of a Database
A relational database organizes data into tables. Each table represents one type of entity (like users, products, or orders).
π Anatomy of a Table
Columns define WHAT data is stored (name, age, email). Rows represent individual records (one user, one product). Each cell holds a single value.
Key Terminology
ποΈ Table
A collection of related data organized in rows and columns (like a spreadsheet tab)
π Column (Field)
A vertical set of data of the same type β like "name" or "salary"
π Row (Record)
A horizontal entry β one complete item (one employee, one product)
π Primary Key (PK)
A unique identifier for each row β usually an ID number
Data Types
Every column has a data type that defines what kind of values it can hold:
| Type | Description | Example |
|---|---|---|
INTEGER | Whole numbers | 42, -7, 1000 |
VARCHAR(n) | Text up to n characters | 'Alice', 'Hello' |
TEXT | Unlimited length text | Long descriptions |
DECIMAL | Exact decimal numbers | 99.95, 3.14 |
DATE | Calendar date | '2025-01-15' |
BOOLEAN | True/False | TRUE, FALSE |
TIMESTAMP | Date + time | '2025-01-15 14:30:00' |
Note: Text values in SQL are always wrapped in single quotes: 'like this'. Numbers don't need quotes. This is a common source of errors!
π§ Quick Check
In a database table, what does a ROW represent?
The SELECT Statement
SELECT is how you retrieve data from a database. It's the command you'll use most often.
SELECT column1, column2, ...
FROM table_name;
Select All Columns
Use * (asterisk) to get every column:
-- Get everything from the employees table
SELECT * FROM employees;
| id | name | department | salary |
|---|---|---|---|
| 1 | Alice | Engineering | 95000 |
| 2 | Bob | Marketing | 72000 |
| 3 | Carol | Engineering | 105000 |
| 4 | Dave | Sales | 68000 |
Select Specific Columns
-- Only get names and salaries
SELECT name, salary FROM employees;
Aliases with AS
Rename columns in the output using AS:
SELECT
name AS employee_name,
salary AS annual_salary,
salary / 12 AS monthly_salary
FROM employees;
DISTINCT β Remove Duplicates
-- Get unique department names only
SELECT DISTINCT department FROM employees;
Pro Tip: Avoid using SELECT * in production. Always specify the columns you need β it's faster and makes your intent clear.
Write a SELECT Query
Write a query to select only the name and department columns from the employees table.
The WHERE Clause
WHERE filters rows based on conditions. Only rows that meet the condition are returned.
SELECT * FROM employees
WHERE department = 'Engineering';
| id | name | department | salary |
|---|---|---|---|
| 1 | Alice | Engineering | 95000 |
| 3 | Carol | Engineering | 105000 |
Comparison Operators
| Operator | Meaning | Example |
|---|---|---|
= | Equal to | salary = 95000 |
!= or <> | Not equal to | department != 'Sales' |
> | Greater than | salary > 80000 |
< | Less than | salary < 80000 |
>= | Greater or equal | salary >= 95000 |
<= | Less or equal | salary <= 72000 |
Logical Operators
AND β Both conditions must be true
SELECT * FROM employees
WHERE department = 'Engineering'
AND salary > 100000;
OR β At least one condition must be true
SELECT * FROM employees
WHERE department = 'Sales'
OR department = 'Marketing';
NOT β Negate a condition
SELECT * FROM employees
WHERE NOT department = 'Sales';
Special Operators
BETWEEN β Range check
SELECT * FROM employees
WHERE salary BETWEEN 70000 AND 100000;
IN β Match any in a list
SELECT * FROM employees
WHERE department IN ('Engineering', 'Marketing');
LIKE β Pattern matching
-- Names starting with 'A'
SELECT * FROM employees
WHERE name LIKE 'A%';
-- Names containing 'ol'
SELECT * FROM employees
WHERE name LIKE '%ol%';
LIKE Wildcards: % matches zero or more characters. _ matches exactly one character. So '_ob' matches "Bob" but not "Jacob".
IS NULL β Check for missing values
-- Find employees with no department assigned
SELECT * FROM employees
WHERE department IS NULL;
Common Mistake: Never use = NULL. Always use IS NULL or IS NOT NULL. NULL is not a value β it's the absence of a value.
Filter Employees
Write a query to find all employees with a salary greater than 80000.
ORDER BY β Sort Results
By default, SQL doesn't guarantee any particular order. Use ORDER BY to sort.
-- Sort by salary, highest first
SELECT * FROM employees
ORDER BY salary DESC;
-- Sort by name alphabetically (default is ASC)
SELECT * FROM employees
ORDER BY name ASC;
β¬οΈ ASC vs DESC
ASC (ascending) = AβZ, 1β100, oldestβnewest. This is the default.
DESC (descending) = ZβA, 100β1, newestβoldest.
Multi-Column Sorting
-- Sort by department first, then salary within each dept
SELECT * FROM employees
ORDER BY department ASC, salary DESC;
LIMIT β Restrict Row Count
-- Get the top 2 highest-paid employees
SELECT * FROM employees
ORDER BY salary DESC
LIMIT 2;
OFFSET β Skip Rows (Pagination)
-- Skip first 2, then get next 2 (page 2)
SELECT * FROM employees
ORDER BY id
LIMIT 2 OFFSET 2;
Execution Order: SQL processes clauses in this order β FROM β WHERE β SELECT β ORDER BY β LIMIT. The visual above shows the logical flow.
Top Earner
Write a query to get the single highest-paid employee's name and salary.
Aggregate Functions
These functions compute a single value from a set of rows:
COUNT()
Number of rows
SUM()
Total of values
AVG()
Average value
MIN()
Smallest value
-- Count all employees
SELECT COUNT(*) AS total_employees FROM employees;
-- Average salary
SELECT AVG(salary) AS avg_salary FROM employees;
-- Highest salary
SELECT MAX(salary) AS top_salary FROM employees;
GROUP BY β Aggregate Per Category
GROUP BY splits rows into groups and applies the aggregate function to each group separately.
-- Average salary per department
SELECT
department,
COUNT(*) AS num_employees,
AVG(salary) AS avg_salary,
MAX(salary) AS max_salary
FROM employees
GROUP BY department;
| department | num_employees | avg_salary | max_salary |
|---|---|---|---|
| Engineering | 2 | 100000 | 105000 |
| Marketing | 1 | 72000 | 72000 |
| Sales | 1 | 68000 | 68000 |
HAVING β Filter Groups
WHERE filters rows BEFORE grouping. HAVING filters AFTER grouping.
-- Only departments with avg salary above 75000
SELECT
department,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 75000;
WHERE vs HAVING: Use WHERE to filter individual rows. Use HAVING to filter aggregated groups. You can use both in the same query!
Full Execution Order
Department Stats
Write a query to find the total salary cost per department. Use SUM() and GROUP BY.
Why JOINs?
Real databases have many tables. A user's name is in one table, their orders in another. JOINs let you combine them based on a shared key.
INNER JOIN
Returns only rows that have a match in both tables.
SELECT
e.name,
d.dept_name
FROM employees e
INNER JOIN departments d
ON e.dept_id = d.id;
Types of JOINs
Only matching rows
All from A + matches from B
All from B + matches from A
All rows from both
LEFT JOIN Example
-- All departments, even those with no employees
SELECT
d.dept_name,
e.name
FROM departments d
LEFT JOIN employees e
ON d.id = e.dept_id;
| dept_name | name |
|---|---|
| Engineering | Alice |
| Engineering | Carol |
| Marketing | Bob |
| Sales | NULL |
When to use which? Use INNER JOIN when you only want matches. Use LEFT JOIN when you want all records from the "left" table regardless of match. LEFT JOIN is the most commonly used after INNER JOIN.
Write a JOIN
Write an INNER JOIN to get employee names with their department names from the two tables above.
What is a Subquery?
A subquery is a SELECT statement nested inside another query. It runs first, and its result is used by the outer query.
-- Employees earning above average
SELECT name, salary
FROM employees
WHERE salary > (
SELECT AVG(salary) FROM employees
);
π How It Executes
1. The inner query runs first: SELECT AVG(salary) FROM employees β returns 85000
2. The outer query becomes: SELECT ... WHERE salary > 85000
3. Only Alice (95000) and Carol (105000) are returned.
Subquery with IN
-- Employees in departments that have 'Eng' in the name
SELECT name
FROM employees
WHERE dept_id IN (
SELECT id FROM departments
WHERE dept_name LIKE '%Eng%'
);
Subquery in FROM (Derived Table)
-- Get departments with their employee count, then filter
SELECT *
FROM (
SELECT department, COUNT(*) AS emp_count
FROM employees
GROUP BY department
) AS dept_stats
WHERE emp_count > 1;
EXISTS β Check for Existence
-- Departments that have at least one employee
SELECT dept_name
FROM departments d
WHERE EXISTS (
SELECT 1 FROM employees e
WHERE e.dept_id = d.id
);
Subquery vs JOIN: Many subqueries can be rewritten as JOINs. JOINs are often more readable and performant. Use subqueries when the logic is cleaner that way (like comparing to an aggregate).
INSERT β Add New Rows
-- Insert a single row
INSERT INTO employees (name, department, salary)
VALUES ('Eve', 'Engineering', 98000);
-- Insert multiple rows
INSERT INTO employees (name, department, salary)
VALUES
('Frank', 'Sales', 71000),
('Grace', 'Marketing', 78000);
UPDATE β Modify Existing Rows
-- Give Alice a raise
UPDATE employees
SET salary = 105000
WHERE name = 'Alice';
-- 10% raise for all engineers
UPDATE employees
SET salary = salary * 1.10
WHERE department = 'Engineering';
DANGER: Always include a WHERE clause with UPDATE and DELETE! Without it, you'll modify or delete EVERY row in the table. This is one of the most common and costly SQL mistakes.
DELETE β Remove Rows
-- Delete a specific employee
DELETE FROM employees
WHERE name = 'Dave';
-- Delete all employees in Sales
DELETE FROM employees
WHERE department = 'Sales';
-- β οΈ DANGER: This deletes EVERYTHING!
-- DELETE FROM employees;
π‘οΈ Safety Checklist
Before running UPDATE or DELETE in production:
1. First run a SELECT with the same WHERE clause to see what will be affected
2. Count the rows: is the number what you expect?
3. Use a transaction (BEGIN / COMMIT / ROLLBACK) for safety
4. Back up your data regularly
-- Step 1: Preview what will be affected
SELECT * FROM employees
WHERE department = 'Sales';
-- Step 2: If looks good, then update inside a transaction
BEGIN;
UPDATE employees
SET salary = salary * 1.05
WHERE department = 'Sales';
-- Check the results
SELECT * FROM employees WHERE department = 'Sales';
-- If good:
COMMIT;
-- If bad:
-- ROLLBACK;
CREATE TABLE
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE,
dept_id INTEGER,
salary DECIMAL(10,2) DEFAULT 0,
hired_at DATE DEFAULT CURRENT_DATE
);
Constraints
| Constraint | Purpose |
|---|---|
PRIMARY KEY | Unique identifier for each row. Cannot be NULL. |
NOT NULL | Column must always have a value. |
UNIQUE | No two rows can have the same value in this column. |
DEFAULT | Auto-fill value if none is provided. |
CHECK | Custom validation rule. |
FOREIGN KEY | Links to a primary key in another table. |
Foreign Keys β Relationships
A foreign key creates a link between two tables, enforcing that values in one column must exist in another table.
CREATE TABLE departments (
id INTEGER PRIMARY KEY,
dept_name VARCHAR(100) NOT NULL
);
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
dept_id INTEGER,
FOREIGN KEY (dept_id) REFERENCES departments(id)
);
ALTER TABLE β Modify Structure
-- Add a column
ALTER TABLE employees
ADD COLUMN phone VARCHAR(20);
-- Remove a column
ALTER TABLE employees
DROP COLUMN phone;
-- Drop (delete) an entire table
DROP TABLE employees;
DROP TABLE permanently deletes the table and ALL its data. Always double-check before running this command!
Views β Saved Queries
A view is a virtual table defined by a query. It doesn't store data β it runs the query each time you access it.
-- Create a view for high earners
CREATE VIEW high_earners AS
SELECT name, salary, department
FROM employees
WHERE salary > 90000;
-- Now use it like a regular table
SELECT * FROM high_earners;
Indexes β Speed Up Queries
An index is like a book's index β it helps the database find rows faster without scanning every row.
-- Create an index on the department column
CREATE INDEX idx_department
ON employees(department);
-- Queries filtering by department are now faster
SELECT * FROM employees
WHERE department = 'Engineering';
Index Trade-offs: Indexes speed up reads but slow down writes (INSERT, UPDATE, DELETE). Add indexes on columns you frequently filter or join on.
CTEs β Common Table Expressions
A WITH clause lets you define named temporary result sets, making complex queries much more readable.
WITH dept_stats AS (
SELECT
department,
AVG(salary) AS avg_salary,
COUNT(*) AS emp_count
FROM employees
GROUP BY department
)
SELECT *
FROM dept_stats
WHERE avg_salary > 80000
ORDER BY avg_salary DESC;
Window Functions
Perform calculations across rows related to the current row, without collapsing them into groups.
-- Rank employees by salary within their department
SELECT
name,
department,
salary,
RANK() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS dept_rank
FROM employees;
πͺ Common Window Functions
ROW_NUMBER() β Unique sequential number for each row
RANK() β Rank with gaps for ties (1, 2, 2, 4)
DENSE_RANK() β Rank without gaps (1, 2, 2, 3)
SUM() OVER() β Running total
LAG() / LEAD() β Access previous/next row's value
Running Total Example
SELECT
name,
salary,
SUM(salary) OVER (
ORDER BY salary
) AS running_total
FROM employees;
π― Project: Online Bookstore
Design a database for an online bookstore with authors, books, customers, and orders. Then write queries to answer business questions.
π Schema Design
We need four tables: authors, books, customers, and orders. Think about the relationships between them.
-- 1. Authors
CREATE TABLE authors (
id INTEGER PRIMARY KEY,
name VARCHAR(200) NOT NULL,
country VARCHAR(100)
);
-- 2. Books
CREATE TABLE books (
id INTEGER PRIMARY KEY,
title VARCHAR(300) NOT NULL,
author_id INTEGER REFERENCES authors(id),
price DECIMAL(8,2),
genre VARCHAR(50),
published DATE
);
-- 3. Customers
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
name VARCHAR(200) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
city VARCHAR(100)
);
-- 4. Orders
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id),
book_id INTEGER REFERENCES books(id),
quantity INTEGER DEFAULT 1,
order_date DATE DEFAULT CURRENT_DATE
);
π Challenges
Try writing SQL for each of these business questions:
Top Selling Books
Find the top 5 best-selling books by total quantity ordered. Show the book title and total quantity.
Revenue Per Author
Calculate total revenue per author (price Γ quantity for all their books). Show author name and total revenue.
Customers Who Never Ordered
Find all customers who have never placed an order. (Hint: LEFT JOIN or NOT EXISTS)
Advanced: Rank Books Per Genre
Use a window function to rank books by price within each genre. Show title, genre, price, and rank.
π Congratulations!
You've completed the SQL Mastery course! You now understand SELECT, WHERE, JOINs, aggregation, subqueries, data modification, schema design, views, indexes, CTEs, and window functions.
Next steps: Practice on LeetCode SQL, HackerRank SQL, or SQLZoo. Try installing PostgreSQL or SQLite locally and build your own projects!