ruk·si

🐘 PostgreSQL
EXPLAIN

Updated at 2015-10-03 10:55

EXPLAIN:

  • Shows what query planner thinks about the query.
  • Shows estimates of the required cost.
  • Use when checking index usage and when the query never finishes.

EXPLAIN ANALYZE

  • Also really executes the query.
  • Shows how the query was executed.
  • Shows which step took the most time.
  • Hints why the query was slow.
  • Use when possible

EXPLAIN (ANALYZE, BUFFERS):

  • Returns even more runtime statistics.
  • Use on I/O-intensive queries.
BEGIN;
EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
ROLLBACK;

Query Planner in a nutshell:

  1. Break query down to atomic nodes.
  2. Figure each possible way to execute each node.
  3. Chain different combinations together and estimate chain cost.
  4. Choose the lowest cost.

Cost is totally arbitrary cost unit for time and resources required.

(cost=12.04..1632.35 rows=425 width=321)
# 12.04 is the cost to return the first row.
# 1632.35 is the cost to return all rows.
# sometimes these are equal, which means it must read all rows to return one.
# upper-level node includes the cost of all its child nodes
# 425 is the estimated number of rows returned by the node.
# 321 size of each row in the table, in bytes.

(actual time=0.033..6.577 rows=2048 loops=1)
# 0.033 how long it took to get the first row.
# 6.577 how long it took to get all the rows.
# 2048 is the actual number of rows returned by the nodes.
# The rows were looped through 1 time.

Deepest parts are executed first in the EXPLAIN result.

Sort  (cost=173.12..178.24 rows=2048 width=107)
  Sort Key: city
  ->  Seq Scan on customer  (cost=0.00..60.48 rows=2048 width=107)
-- goes through the data FIRST, then sorts it.

You get better estimates if you run VACUUM ANALYZE on the table or the whole database.

VACUUM ANALYZE my_table;

Operation Types:

  • Sort: sorts the result like ORDER BY
  • Unique: deduplicate rows like DISTINCT.
  • Limit: drop number of rows like LIMIT.
  • Seq Scan: full table scan, usually bad performance.
  • Index Scan: index scan, usually good performance.
  • Index Only Scan: scan index and ignore heap, good performance but you only gets indexed data.
  • Bitmap Index Scan: scan index and build bitmap of the returned rows
  • Bitmap Heap Scan: using the built bitmap
  • Nested Loop: for each row in A, match rows in B.
  • Hash Join: rows of A are entered into in-memory and matched with rows in B.
  • Merge Join: A and B are sorted, then joined together.
  • Lateral Join: join table against table expression.
  • Semi-Join: partial join like in outer join.
  • Anti-Join: exclusion like NOT IN ().
  • GroupAggregate: regular aggregation.
  • HashAggregate: in-memory aggregation using hash tables.
  • WindowAgg: used for windowing clause expression.
  • CTE Scan: joins common-table expression the the main query.
  • Subquery Scan: joins sub query the the main query.
  • Materialize: create a recordset in memory from a query fragment
  • Append: merge rowsets for UNION ALL and partitions

Sources