Indexes and Basic Performance: Helping PostgreSQL Find Data Faster

Welcome back.
In the previous lesson, you learned about aggregate functions.
You learned how to count rows.
Sum values.
Find averages.
Find minimum and maximum values.
Group data.
Filter groups with HAVING.
Very good.
Now PostgreSQL can answer questions like:
How many products do we have?
What is the total stock value?
How many orders does each customer have?
Which category has the most products?
Excellent.
But now another question appears.
A very serious question.
What happens when the table becomes huge?
When a table has ten rows, everything is fast.
When a table has ten million rows, PostgreSQL starts looking at you differently.
Like:
Are you sure about this query?
This is where indexes help.
Indexes help PostgreSQL find data faster.
They are not magic.
But they are powerful.
Like a good map.
Without an index, PostgreSQL may need to scan the whole table.
With an index, PostgreSQL may jump directly to the data it needs.
Very useful.
Very database.
Very “please do not make my server cry”.
What You Will Learn
In this lesson, you will learn:
- why queries can become slow;
- what an index is;
- how indexes help PostgreSQL find rows faster;
- what a sequential scan is;
- what an index scan is;
- how to use EXPLAIN;
- how to use EXPLAIN ANALYZE;
- how to create an index;
- when indexes are useful;
- when indexes are not useful;
- why indexes are not free;
- how indexes affect inserts and updates;
- how unique indexes work;
- how multi-column indexes work;
- common mistakes with indexes.
By the end of this lesson, you will understand the basic idea of database performance.
Not advanced rocket science.
Not yet.
Just enough to stop writing queries that make PostgreSQL look tired.
A noble goal.
Prepare the Database
Open psql as the postgres user:
sudo -iu postgres psql
Connect to your database:
\c learning_postgresql
If the database does not exist, create it:
CREATE DATABASE learning_postgresql;
Then connect:
\c learning_postgresql
Now remove old tables if they exist:
DROP TABLE IF EXISTS products;
Create a new products table:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
category VARCHAR(100) NOT NULL,
price NUMERIC(10, 2) CHECK (price >= 0),
quantity INTEGER CHECK (quantity >= 0),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
This table is simple.
But it is enough for learning indexes.
We have:
- product name;
- category;
- price;
- quantity;
- creation date.
A normal small table.
For now.
Soon it will become a little army of rows.
PostgreSQL is ready.
Maybe.
Insert Some Data
Insert a few products:
INSERT INTO products (name, category, price, quantity)
VALUES
('Laptop', 'Electronics', 900.00, 5),
('Mouse', 'Electronics', 25.00, 30),
('Keyboard', 'Electronics', 70.00, 15),
('Desk Chair', 'Furniture', 150.00, 8),
('Bookshelf', 'Furniture', 120.00, 4),
('Notebook', 'Office', 5.00, 100),
('Pen', 'Office', 2.00, 200),
('Whiteboard', 'Office', 80.00, 3);
Check the data:
SELECT * FROM products;
This table is tiny.
A tiny table is fast even without indexes.
PostgreSQL can blink and read it.
But real applications do not stay tiny forever.
Data grows.
Logs grow.
Orders grow.
Users grow.
The database grows.
Then one day a query becomes slow and everyone looks at the developer.
Very unfair.
But also sometimes correct.
Add More Rows for Testing
To see performance ideas, we need more rows.
PostgreSQL has a useful function called generate_series.
It can generate many rows.
Run this:
INSERT INTO products (name, category, price, quantity)
SELECT
'Product ' || number,
CASE
WHEN number % 4 = 0 THEN 'Electronics'
WHEN number % 4 = 1 THEN 'Furniture'
WHEN number % 4 = 2 THEN 'Office'
ELSE 'Kitchen'
END,
(number % 100) + 1,
(number % 50) + 1
FROM generate_series(1, 100000) AS number;
This inserts one hundred thousand products.
Not real products.
Fake products.
But useful fake products.
Like training weights for PostgreSQL.
Now check the number of rows:
SELECT COUNT(*) AS total_products
FROM products;
You should see 100,008 rows: the 100,000 generated rows plus the 8 products from before.
Now the table is big enough to make performance examples more interesting.
Not huge.
But no longer a tiny toy.
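If you are curious how much disk this little army costs, PostgreSQL can tell you. This is optional, and the exact number depends on your machine:

```sql
-- Total disk used by the products table
-- (including its indexes and TOAST storage)
SELECT pg_size_pretty(pg_total_relation_size('products'));
```

A few megabytes. Small for a server. Large for a lesson.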
Why Queries Become Slow
A query can be slow for many reasons.
For beginners, the most common reason is simple:
PostgreSQL has to look through too many rows.
Example:
SELECT *
FROM products
WHERE category = 'Electronics';
If there is no index on category, PostgreSQL may need to check every row.
One by one.
Like searching for one paper in a messy garage.
Possible.
But not elegant.
This is called a sequential scan.
Sequential Scan
A sequential scan means PostgreSQL reads the table row by row.
It checks each row and asks:
Does this row match the condition?
For small tables, this is fine.
For large tables, it can be slow.
Example:
SELECT *
FROM products
WHERE category = 'Electronics';
Without an index on category, PostgreSQL may scan the whole table.
This is not always bad.
Sometimes a sequential scan is the best choice.
But if you often search by a column, an index may help.
The important word is often.
Do not create indexes randomly.
Indexes are tools.
Not decorations.
What Is an Index?
An index is a special database structure that helps PostgreSQL find rows faster.
Think of a book.
If a book has no index, you may need to scan page after page.
If a book has an index, you can look up a topic and jump to the right page.
A database index works with the same idea.
Without index:
Read many rows and check them.
With index:
Use the index to find matching rows faster.
PostgreSQL indexes are usually stored separately from the table.
The table contains the actual data.
The index contains organized pointers to that data.
Very simplified.
But good enough for now.
Like a GPS for rows.
Without annoying voice instructions.
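You can see which indexes a table already has by querying the pg_indexes view (or with \d products inside psql). At this point you should see only the primary key index:

```sql
-- List every index on the products table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'products';
```

Keep this query handy. We will create more indexes soon, and it is nice to see them pile up.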
Check a Query with EXPLAIN
PostgreSQL can show how it plans to run a query.
Use EXPLAIN.
EXPLAIN
SELECT *
FROM products
WHERE category = 'Electronics';
You may see something like:
Seq Scan on products
Filter: ((category)::text = 'Electronics'::text)
The exact output may be different.
But the important part is:
Seq Scan
That means PostgreSQL plans to scan the table.
It will read rows and filter them.
This is useful information.
PostgreSQL is telling you its plan.
Like a database GPS.
But with fewer colors.
EXPLAIN ANALYZE
EXPLAIN shows the plan.
EXPLAIN ANALYZE actually runs the query and shows what happened.
Run:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE category = 'Electronics';
You may see:
Seq Scan on products
Filter: ((category)::text = 'Electronics'::text)
Rows Removed by Filter: ...
Planning Time: ...
Execution Time: ...
The exact numbers depend on your machine.
The important parts are:
Seq Scan
Execution Time
Rows Removed by Filter
Rows Removed by Filter means PostgreSQL checked rows and rejected many of them.
That is work.
Sometimes necessary work.
Sometimes avoidable work.
EXPLAIN ANALYZE is very useful.
But remember:
It runs the query.
So be careful with queries that change data.
Do not casually run this on a dangerous DELETE.
PostgreSQL will not say:
Are you emotionally ready?
It will just execute.
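If you really need to inspect a data-changing statement, one common safety trick is a transaction you roll back. This sketch assumes you are experimenting on the learning database, not production:

```sql
BEGIN;

-- EXPLAIN ANALYZE really executes this DELETE...
EXPLAIN ANALYZE
DELETE FROM products
WHERE price < 5.00;

-- ...so undo it immediately
ROLLBACK;
```

The plan is shown. The rows come back. Everyone stays calm.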
Create an Index
Now create an index on category:
CREATE INDEX idx_products_category
ON products(category);
The name idx_products_category is just a name.
A common naming style is:
idx_table_column
Now run the query again:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE category = 'Electronics';
You may see an index being used.
For example:
Bitmap Index Scan
Bitmap Heap Scan
or:
Index Scan
The exact plan depends on PostgreSQL and your data.
Important point:
PostgreSQL now has another option.
It can use the index.
Without the index, it had to scan the table.
With the index, it may find matching rows faster.
May.
Not always.
PostgreSQL chooses what it thinks is best.
It is clever.
Usually.
Why PostgreSQL May Still Use Sequential Scan
Sometimes you create an index and PostgreSQL still uses a sequential scan.
Do not panic.
This can be normal.
Why?
Because if many rows match the condition, reading the whole table may be faster.
Example:
SELECT *
FROM products
WHERE category = 'Electronics';
If twenty-five percent of the table is Electronics, PostgreSQL may decide:
Many rows match. I will scan the table.
An index is most useful when the query returns a small part of the table.
For example:
SELECT *
FROM products
WHERE id = 50000;
This is very selective.
Only one row.
Great for an index.
Searching for one product by id is fast because id is already indexed by the primary key.
Primary keys automatically create indexes.
PostgreSQL did that for you.
Very kind.
Very professional.
Primary Key Index
When you create this:
id SERIAL PRIMARY KEY
PostgreSQL automatically creates a unique index for id.
That is why this query is fast:
SELECT *
FROM products
WHERE id = 50000;
Check it:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE id = 50000;
You should see something like:
Index Scan using products_pkey
products_pkey is the primary key index.
You did not create it manually.
PostgreSQL created it because primary keys must be unique and searchable.
This is one reason primary keys are important.
They are not just decoration.
They are database identity cards.
Index on Price
Let us create an index on price.
CREATE INDEX idx_products_price
ON products(price);
Now run:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE price = 50.00;
PostgreSQL may use the index.
This depends on the data.
Now try:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE price > 90.00;
An index can help with range searches too.
Examples:
WHERE price = 50.00
WHERE price > 90.00
WHERE price BETWEEN 10.00 AND 20.00
Indexes are especially useful when you search, filter, sort, or join using a column.
But again:
Do not create indexes on every column.
That is not optimization.
That is database confetti.
And confetti is annoying to clean.
Index and ORDER BY
Indexes can also help with sorting.
Example:
EXPLAIN ANALYZE
SELECT *
FROM products
ORDER BY price ASC
LIMIT 10;
Because we created an index on price, PostgreSQL may use it to find the cheapest products faster.
This is useful for queries like:
Show the cheapest products.
Show the most expensive products.
Show the newest posts.
Show the latest orders.
A common example is an index on created_at.
CREATE INDEX idx_products_created_at
ON products(created_at);
Then this query may become faster:
SELECT *
FROM products
ORDER BY created_at DESC
LIMIT 10;
This is common in real applications.
Blogs.
Orders.
Messages.
Logs.
Everything wants the latest ten things.
Because apparently humans love “latest”.
Index and JOIN
Indexes can help with joins too.
Imagine two tables:
orders
customers
Usually:
orders.customer_id references customers.id
When you join:
SELECT
o.id,
c.name,
o.total
FROM orders AS o
JOIN customers AS c
ON o.customer_id = c.id;
PostgreSQL needs to match rows.
Indexes on join columns can help.
Primary keys are indexed automatically.
So customers.id is already indexed if it is a primary key.
But foreign keys are not always automatically indexed.
That means orders.customer_id may need an index if you often join or filter by it.
Example:
CREATE INDEX idx_orders_customer_id
ON orders(customer_id);
This is a very common real-world index.
Foreign key columns are often good index candidates.
Not always.
But often.
Database answer number one:
It depends.
Annoying.
But true.
Unique Index
A unique index prevents duplicate values.
Example:
DROP TABLE IF EXISTS users;
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL,
name VARCHAR(100) NOT NULL
);
Create a unique index:
CREATE UNIQUE INDEX idx_users_email_unique
ON users(email);
Insert a user:
INSERT INTO users (email, name)
VALUES ('anna@example.com', 'Anna');
Try to insert another user with the same email:
INSERT INTO users (email, name)
VALUES ('anna@example.com', 'Another Anna');
PostgreSQL will reject it.
Good.
Emails should usually be unique.
A unique index does two things:
- helps search faster;
- protects data from duplicates.
Very useful.
Very responsible.
Like a security guard for your table.
But with fewer sunglasses.
UNIQUE Constraint or Unique Index?
You can also write:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL
);
This creates a unique constraint.
PostgreSQL also creates a unique index behind the scenes.
For beginners, use UNIQUE in the table definition when it is part of the data rule.
Example:
Email must be unique.
Username must be unique.
Product code must be unique.
Use explicit indexes when you are improving search performance.
Simple rule:
Data rule: constraint.
Performance help: index.
Not perfect.
But very practical.
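A practical companion to a unique email rule is the ON CONFLICT clause, which lets an insert skip the row instead of failing with an error. A minimal sketch:

```sql
-- Insert a user, but silently skip it
-- if the email already exists
INSERT INTO users (email, name)
VALUES ('anna@example.com', 'Anna Again')
ON CONFLICT (email) DO NOTHING;
```

This only works when a unique constraint or unique index backs the conflict target. Another reason to declare the rule properly.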
Multi-Column Indexes
An index can include more than one column.
Example:
CREATE INDEX idx_products_category_price
ON products(category, price);
This can help queries like:
SELECT *
FROM products
WHERE category = 'Electronics'
ORDER BY price ASC;
The index is ordered by category first, then by price.
Column order matters.
This index is useful for:
WHERE category = 'Electronics'
and:
WHERE category = 'Electronics'
ORDER BY price ASC
But it may be less useful for:
WHERE price = 50.00
Why?
Because price is the second column in the index.
The first column is category.
Multi-column indexes are powerful.
But you need to create them based on real queries.
Not based on vibes.
Databases do not optimize vibes.
Sadly.
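You can test the column-order claim yourself. Plans vary with your data and settings, but a filter on price alone will typically not use idx_products_category_price:

```sql
-- Likely ignores idx_products_category_price,
-- because price is not the leading column.
-- A single-column index on price is a better fit here.
EXPLAIN
SELECT *
FROM products
WHERE price = 50.00;
```

Compare the plan with and without the single-column price index. The difference is the whole lesson in miniature.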
Indexes Are Not Free
Indexes make some reads faster.
But they have costs.
An index:
- uses disk space;
- must be updated when data changes;
- can slow down INSERT;
- can slow down UPDATE;
- can slow down DELETE;
- adds complexity.
When you insert a row, PostgreSQL must insert data into the table.
But it must also update indexes.
If a table has many indexes, writing becomes heavier.
So do not create indexes everywhere.
A table with too many indexes is like a person carrying ten backpacks.
Maybe prepared.
Definitely slower.
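The disk cost, at least, is easy to measure. These functions are built in; the numbers depend on your data:

```sql
-- Table data vs. index data, side by side
SELECT
  pg_size_pretty(pg_table_size('products'))   AS table_size,
  pg_size_pretty(pg_indexes_size('products')) AS index_size;
```

If index_size starts rivaling table_size, it is worth asking which backpacks you actually need.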
When Should You Create an Index?
Create an index when:
- you often search by a column;
- you often filter by a column with WHERE;
- you often sort by a column with ORDER BY;
- you often join using a column;
- the table is large enough for performance to matter;
- the query returns a small part of the table;
- EXPLAIN ANALYZE shows a slow scan.
Examples:
WHERE email = 'anna@example.com'
WHERE customer_id = 10
WHERE created_at >= '2026-01-01'
ORDER BY created_at DESC LIMIT 10
JOIN orders ON orders.customer_id = customers.id
These are common places where indexes can help.
But do not guess blindly.
Measure.
Then index.
Then measure again.
That is the grown-up way.
Slightly boring.
Very effective.
When an Index May Not Help
An index may not help when:
- the table is very small;
- the query returns most of the table;
- the column has very few different values;
- you use the column in a way that prevents index usage;
- the index does not match the query;
- the data changes very often and reads are rare.
Example:
WHERE category = 'Electronics'
If only four categories exist and each category has many rows, the index may not help much.
Another example:
WHERE quantity > 0
If almost every product has quantity greater than zero, PostgreSQL may prefer a sequential scan.
Because almost everything matches.
An index is useful when it can avoid work.
If the query needs almost everything, there is not much work to avoid.
Sad but logical.
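You can watch this happen. The sketch below creates a temporary index on quantity, checks the plan, then removes it. Since almost every row matches, expect PostgreSQL to shrug and scan anyway:

```sql
CREATE INDEX idx_products_quantity
ON products(quantity);

-- Almost every product matches, so PostgreSQL
-- will probably choose a sequential scan anyway
EXPLAIN
SELECT *
FROM products
WHERE quantity > 0;

DROP INDEX idx_products_quantity;
```

An index that is never used is pure cost. Dropping it is the right call.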
Functions Can Hide Indexes
Be careful with functions in WHERE.
Suppose you have an index on email.
CREATE INDEX idx_users_email
ON users(email);
This query can use the index:
SELECT *
FROM users
WHERE email = 'anna@example.com';
But this query may not use the normal index:
SELECT *
FROM users
WHERE LOWER(email) = 'anna@example.com';
Why?
Because PostgreSQL is not searching the raw email value.
It is searching the result of LOWER(email).
That is different.
For this kind of query, you may need an expression index:
CREATE INDEX idx_users_lower_email
ON users(LOWER(email));
Now PostgreSQL can use the index for:
SELECT *
FROM users
WHERE LOWER(email) = 'anna@example.com';
This is more advanced.
But important.
If your query transforms the column, a normal index may not help.
The database is clever.
But it does not read minds.
Thankfully.
LIKE and Indexes
Indexes can sometimes help with LIKE.
Example:
SELECT *
FROM products
WHERE name LIKE 'Product 12%';
This searches for names that start with Product 12.
This can use an index in some situations.
But this query is harder:
SELECT *
FROM products
WHERE name LIKE '%12';
Why?
Because the pattern starts with a wildcard.
PostgreSQL cannot easily jump to the beginning of the value.
It has to search more.
Simple rule:
LIKE 'abc%' can be index-friendly.
LIKE '%abc' is usually not index-friendly.
LIKE '%abc%' is usually not index-friendly.
There are special index types for advanced text search.
But for now, remember this:
Searching from the beginning is easier.
Searching from the middle is harder.
Like finding a book title when you only remember one word from the middle.
Good luck, brave librarian.
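One detail worth knowing: whether LIKE 'abc%' can use a plain b-tree index depends on your database's locale. If prefix searches refuse to use your index, PostgreSQL offers the text_pattern_ops operator class. A sketch, assuming a non-C locale:

```sql
-- A b-tree index tuned for LIKE 'prefix%' searches,
-- useful when the database locale is not 'C'
CREATE INDEX idx_products_name_pattern
ON products (name text_pattern_ops);

EXPLAIN
SELECT *
FROM products
WHERE name LIKE 'Product 12%';
```

If the plan shows an index scan now, the locale was the culprit. Mystery solved.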
Drop an Index
Sometimes you create an index and later decide it is not useful.
You can remove it.
Example:
DROP INDEX IF EXISTS idx_products_price;
This removes the index.
It does not remove the table.
It does not remove the data.
Only the index.
Still, be careful in real projects.
Indexes may be used by important queries.
Dropping the wrong index can make things slow.
And then the server begins its dramatic opera.
Common Mistakes
Creating Indexes on Every Column
Bad idea:
Every column gets an index!
No.
Stop.
This is not a Christmas tree.
Indexes have costs.
Create indexes based on real queries.
Not Using EXPLAIN ANALYZE
Guessing is not performance work.
Use:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE category = 'Electronics';
Measure before and after.
Without measuring, you are just doing database astrology.
Expecting Indexes to Fix Bad Queries
Indexes help.
But they do not fix everything.
A terrible query can still be terrible with an index.
If you select too much data, join too many things badly, or filter in a strange way, indexes may not save you.
Indexes are tools.
Not miracles.
Forgetting That Indexes Slow Writes
If your table receives many inserts and updates, too many indexes can hurt performance.
Reads may become faster.
Writes may become slower.
There is always a trade-off.
The database gives.
The database takes.
Very philosophical.
Ignoring Foreign Key Indexes
Foreign keys are important for relationships.
But if you often join or filter by a foreign key column, consider indexing it.
Example:
CREATE INDEX idx_orders_customer_id
ON orders(customer_id);
This is common in real applications.
Not always necessary.
But very often useful.
Practice
Create an index on category:
CREATE INDEX idx_products_category
ON products(category);
Check a query:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE category = 'Electronics';
Create an index on price:
CREATE INDEX idx_products_price
ON products(price);
Check a price query:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE price BETWEEN 10.00 AND 20.00;
Check sorting:
EXPLAIN ANALYZE
SELECT *
FROM products
ORDER BY price ASC
LIMIT 10;
Create an index on created_at:
CREATE INDEX idx_products_created_at
ON products(created_at);
Check latest products:
EXPLAIN ANALYZE
SELECT *
FROM products
ORDER BY created_at DESC
LIMIT 10;
Create a multi-column index:
CREATE INDEX idx_products_category_price
ON products(category, price);
Check this query:
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE category = 'Office'
ORDER BY price ASC
LIMIT 10;
Run these queries.
Look at the output.
Do not worry if the plan is different from the examples.
PostgreSQL chooses plans based on your data, your indexes, your settings, and your machine.
The goal is not to memorize every plan.
The goal is to understand what PostgreSQL is trying to do.
That is enough for now.
Mini Challenge
Create two tables:
customers
orders
Rules:
- one customer can have many orders;
- each order belongs to one customer;
- customer email must be unique;
- orders have a total value;
- orders have a creation date.
Create the tables:
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id),
total NUMERIC(10, 2) CHECK (total >= 0),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Insert customers:
INSERT INTO customers (email, name)
VALUES
('anna@example.com', 'Anna'),
('marco@example.com', 'Marco'),
('sofia@example.com', 'Sofia');
Insert orders:
INSERT INTO orders (customer_id, total)
SELECT
(number % 3) + 1,
(number % 200) + 10
FROM generate_series(1, 50000) AS number;
Now create useful indexes:
CREATE INDEX idx_orders_customer_id
ON orders(customer_id);
CREATE INDEX idx_orders_created_at
ON orders(created_at);
CREATE INDEX idx_orders_total
ON orders(total);
Test these queries with EXPLAIN ANALYZE:
EXPLAIN ANALYZE
SELECT *
FROM customers
WHERE email = 'anna@example.com';
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE customer_id = 1;
EXPLAIN ANALYZE
SELECT *
FROM orders
ORDER BY created_at DESC
LIMIT 10;
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE total > 150.00;
Then ask yourself:
Which indexes were used?
Which queries still used a sequential scan?
Why?
Did the query return many rows or few rows?
This is how real performance work begins.
Not by guessing.
By observing.
PostgreSQL leaves clues.
You just need to read them.
Like a detective.
But with more semicolons.
Summary
Today you learned:
- queries can become slow when tables grow;
- a sequential scan reads rows one by one;
- an index helps PostgreSQL find rows faster;
- primary keys automatically create indexes;
- EXPLAIN shows the query plan;
- EXPLAIN ANALYZE runs the query and shows real execution details;
- indexes can help with WHERE;
- indexes can help with ORDER BY;
- indexes can help with JOIN;
- unique indexes prevent duplicate values;
- multi-column indexes depend on column order;
- indexes use disk space;
- indexes can slow down inserts, updates, and deletes;
- not every column needs an index;
- performance should be measured, not guessed.
This is a very important step.
You are no longer just writing SQL.
You are starting to think about how PostgreSQL executes SQL.
That is a different level.
A better level.
A level where the database still scares you a little.
But now you have tools.
Next Lesson
In the next lesson, we will build a small practical PostgreSQL project.
We will design tables.
Create relationships.
Insert data.
Run queries.
Use JOIN.
Use aggregate functions.
And think about indexes.
In other words, we will put the course pieces together.
Like assembling furniture.
But with fewer missing screws.
Hopefully.