Skip to content

Improve EXPLAIN ANALYZE output to prioritize actual data rows over plan nodes #417

@apstndb

Description

@apstndb

Problem

The current EXPLAIN ANALYZE output shows "X rows in set" where X is the number of plan nodes, not the actual data rows affected or returned. This is confusing because:

  1. Plan node count is not meaningful for understanding query results
  2. Users care about actual data rows affected/returned, not plan structure metrics
  3. The current message "X rows in set" misleadingly suggests these are data rows

Current Behavior

EXPLAIN ANALYZE with DML

EXPLAIN ANALYZE UPDATE test_table SET value = 'updated' WHERE id <= 2;
-- Shows: "6 rows in set" (6 plan nodes)
-- Actually affects: 2 rows

EXPLAIN ANALYZE with SELECT

EXPLAIN ANALYZE SELECT * FROM test_table WHERE id <= 2;
-- Shows: "X rows in set" (X plan nodes)  
-- Actually returns: 2 rows

Solution Options

Option 1: Clarify the output message

  • Change "X rows in set" to "X plan nodes" or "X nodes in execution plan"
  • Add affected/returned rows information separately when available

Option 2: Prioritize data rows over plan nodes (Recommended)

  • For EXPLAIN ANALYZE DML: Show "Query OK, X rows affected" (actual affected rows)
  • For EXPLAIN ANALYZE SELECT: Show "X rows returned" or "X rows in result set" (actual data rows)
  • Plan node count can be shown in verbose output if needed

Option 3: Show both metrics clearly

  • "X rows affected/returned, Y plan nodes in execution plan"

Why Option 2 is Recommended

Since plan node count has little practical meaning for users, Option 2 is preferable. Users primarily care about:

  • How many rows were affected by their DML statement
  • How many rows were returned by their SELECT query

The number of nodes in the execution plan is an implementation detail that doesn't directly correlate with query performance or results.

Technical Implementation Notes

  • EXPLAIN ANALYZE DML already sets IsExecutedDML = true and populates AffectedRows with actual affected row count
  • The challenge is distinguishing between:
    • Regular DML with THEN RETURN (has TableHeader, shows "rows in set")
    • EXPLAIN ANALYZE results (has TableHeader, currently shows plan nodes as "rows in set")
  • May need to add a flag like IsExplainAnalyze to properly format the output

Related Issues

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions