Monitor and Analyze Transaction Contention

On this page

It is crucial to understand where transaction contention occurs and how it impacts your workload. Contention is a normal part of database operations. In many cases, it has limited impact on a workload. For example, a large number of contention events with a very short duration may not impact your workload. Similarly, a small number of contention events with longer duration may be acceptable if they occur outside the “user path”. However, if contention substantially contributes to query latency, it should be addressed.

This page shows how to monitor and analyze two types of contention: lock contention and serializable conflicts. These types of contention are exposed via different observability mechanisms:

Console
Metrics
crdb_internal tables

The Continuous monitoring section provides further considerations on identifying when contention occurs in your workload.

The remaining sections address analyzing contention once identified:

Analyze using crdb_internal tables: A process to determine which workload and queries are involved in the contention.
Analysis of support scenario: An application of the previously described analysis process to a hypothetical support incident.
Analyze using Insights page: Basic examples of lock contention are addressed.

Monitor using console

The CockroachDB Cloud Console and DB Console provide monitoring of contention using the:

SQL Statement Contention graph
SQL Activity
Insights page

Note:

Contention metrics are typically presented with count and duration information. Both lock contention and serializable conflicts will be recorded in count-related metrics, but only lock contention will be included in duration-related metrics.

SQL Statement Contention graph

On the SQL dashboard within the Metrics page, the SQL Statement Contention graph (CockroachDB Cloud Console or DB Console) shows the rate of contention events per second across the cluster. This graph is based on the sql.distsql.contended_queries.count metric, which measures the rate of contention events per second across the cluster.

In the DB Console, in the cluster view, this rate is averaged across all nodes, which sometimes makes it difficult to interpret. When viewing an individual node, it shows the rate for that single node.

The following image from the DB Console was taken from a cluster running more than 50,000 queries per second with around 2 contention events per second. This contention is unlikely to have an impact on the workload.

DB Console SQL Statement Contention graph

SQL Activity

SQL Activity has per statement and per transaction contention information.

Statements page

The Statements page (CockroachDB Cloud Console or DB Console) shows statements that were involved in contention events. To view these statements, change the search criteria to view the top 100 statement fingerprints by Contention Time, then sort descending by Contention Time column.

The Contention Time column displays the average time and standard deviation of contention events. As noted earlier, only lock contention events are included in the contention time.

The following image shows the Statements page with the top 3 statement fingerprints by Contention Time in a cluster containing the test data from the Analyze using crdb_internal tables section.

Statements page by Contention Time

Transactions page

The Transactions page (CockroachDB Cloud Console or DB Console) shows transactions that were involved in contention events. To view these transactions, change the search criteria to view the top 100 transaction fingerprints by Contention Time, then sort descending by Contention Time column.

The Contention Time column displays the average and standard deviation of the time spent in contention. As noted earlier, only lock contention events are included in the contention time.

The following image shows the Transactions page with the top 3 transactions fingerprints by Contention Time in a cluster containing the test data in the Analyze using crdb_internal tables section.

Transactions page by Contention Time

Insights page

The Insights page (CockroachDB Cloud Console or DB Console) displays a variety of workload insights on statement and transaction executions related to transaction contention:

Slow Execution

A statement or transaction is classified as Slow Execution if it meets either of the following criteria:
- The execution took longer than the configured sql.insights.latency_threshold.
- Anomaly detection is enabled (sql.insights.anomaly_detection.enabled) and the execution was greater than twice the median execution latency and higher than sql.insights.anomaly_detection.latency_threshold.
High Contention

After meeting the Slow Execution criteria, an execution is classified as High Contention if a significant portion of its duration was due to contention.
Failed Execution

A statement or transaction that failed due to a serialization error is identified as a Failed Execution. If the serialization error was due to contention, there is a conflicting or blocking transaction recorded. The execution details will show the conflicting transaction.

Several workload insights settings determine how long workload insights are persisted. If clusters are generating a large number of insights, insights may only be persisted for a short period of time. The sql.insights.execution_insights_capacity setting determines the maximum number of insights stored per node, and the sql.contention.event_store.capacity setting determines the maximum storage used per node for contention events.

Monitor using metrics

As part of normal operation, CockroachDB continuously records metrics that are often useful in troubleshooting performance. The way metrics are exposed depends on the deployment type. Refer to:

The following metrics related to contention are available:

CockroachDB Metric Name	Description	Type	Unit
`sql.distsql.contended_queries.count`	Number of SQL queries that experienced contention Note: Only lock contention counts are included in this metric.	COUNTER	COUNT
`sql.distsql.cumulative_contention_nanos`	Cumulative contention across all queries (in nanoseconds) Note: Only lock contention durations are included in this metric.	COUNTER	NANOSECOND
`txn.restarts`	Number of restarted KV transactions	HISTOGRAM	COUNT
`txn.restarts.asyncwritefailure`	Number of restarts due to async consensus writes that failed to leave intents	COUNTER	COUNT
`txn.restarts.commitdeadlineexceeded`	Number of restarts due to a transaction exceeding its deadline	COUNTER	COUNT
`txn.restarts.readwithinuncertainty`	Number of restarts due to reading a new value within the uncertainty interval	COUNTER	COUNT
`txn.restarts.serializable`	Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE	COUNTER	COUNT
`txn.restarts.txnaborted`	Number of restarts due to an abort by a concurrent transaction (usually due to deadlock)	COUNTER	COUNT
`txn.restarts.txnpush`	Number of restarts due to a transaction push failure	COUNTER	COUNT
`txn.restarts.unknown`	Number of restarts due to a unknown reasons	COUNTER	COUNT
`txn.restarts.writetooold`	Number of restarts due to a concurrent writer committing first	COUNTER	COUNT

Note:

Since these metrics are not categorized by database, interpreting them can be challenging.

Monitor using `crdb_internal` tables

The crdb_internal system catalog is a schema that contains information about internal objects, processes, and metrics related to a specific database. crdb_internal tables are read-only.

`transaction_contention_events` table

The crdb_internal.transaction_contention_events virtual table contains information about historical transaction contention events. By default, lock contention events and serializable conflicts with a blocking transaction are stored in memory on each node and exposed via SQL in this virtual table.

Warning:

Querying crdb_internal.transaction_contention_events triggers RPCs to every node, making it a resource-intensive operation. Avoid frequent polling to minimize resource usage.

Several cluster settings control how events are logged to this table. In most cases, the default settings are sufficient.

Setting	Type	Default	Description
`sql.contention.event_store.capacity`	byte size	`64 MiB`	the in-memory storage capacity per-node of contention event store
`sql.contention.event_store.duration_threshold`	duration	`0s`	minimum contention duration to cause the contention events to be collected into crdb_internal.transaction_contention_events. If `0`, always store.
`sql.contention.event_store.resolution_interval`	duration	`30s`	the interval at which transaction fingerprint ID resolution is performed (set to 0 to disable)
`sql.contention.record_serialization_conflicts.enabled`	boolean	`true`	enables recording 40001 errors with conflicting txn meta as SERIALIZATION_CONFLICT contention events into crdb_internal.transaction_contention_events

Note:

crdb_internal.transaction_content_events displays in-memory data, not on-disk data. The data changes over time as new events arrive and old ones are evicted from the cache.

The table columns are as follows:

Column	Type	Description
`collection_ts`	`TIMESTAMPTZ NOT NULL`	The timestamp when the transaction contention event was collected.
`blocking_txn_id`	`UUID NOT NULL`	The ID of the blocking transaction. You can join this column into the `cluster_contention_events` table.
`blocking_txn_fingerprint_id`	`BYTES NOT NULL`	The ID of the blocking transaction fingerprint. To surface historical information about the transactions that caused the contention, you can join this column into the `statement_statistics` and `transaction_statistics` tables.
`waiting_txn_id`	`UUID NOT NULL`	The ID of the waiting transaction. You can join this column into the `cluster_contention_events` table.
`waiting_txn_fingerprint_id`	`BYTES NOT NULL`	The ID of the waiting transaction fingerprint. To surface historical information about the transactions that caused the contention, you can join this column into the `statement_statistics` and `transaction_statistics` tables.
`waiting_stmt_id`	`STRING NOT NULL`	The statement id of the transaction that was waiting (unique for each statement execution).
`waiting_stmt_fingerprint_id`	`BYTES NOT NULL`	The ID of the waiting statement fingerprint. To surface historical information about the statements that caused the contention, you can join this column into the `statement_statistics` table.
`contention_duration`	`INTERVAL NOT NULL`	The interval of time the waiting transaction spent waiting for the blocking transaction.
`contending_key`	`BYTES NOT NULL`	The key on which the transactions contended.
`contending_pretty_key`	`STRING NOT NULL`	The specific key that was involved in the contention event, in readable format.
`database_name`	`STRING NOT NULL`	The database where the contention occurred.
`schema_name`	`STRING NOT NULL`	The schema where the contention occurred.
`table_name`	`STRING NOT NULL`	The table where the contention occurred.
`index_name`	`STRING NULL`	The index where the contention occurred.
`contention_type`	`STRING NOT NULL`	The type of contention. Possible values: `LOCK_WAIT`: Indicates that the transaction waited on a specific key. The record includes the key and the wait duration. `SERIALIZATION_CONFLICT`: Represents a serialization conflict specific to a transaction execution. This is recorded only when a client-side retry error containing the conflicting transaction details is emitted. After recording, the `contention_type` is not modified. A transaction may have multiple `LOCK_WAIT` events, as they correspond to specific keys, but only one `SERIALIZATION_CONFLICT` event.

The transaction and statement fingerprint IDs (*_txn_fingerprint_id and *_stmt_fingerprint_id) columns are hexadecimal values that can be used to look up the SQL and other information from the crdb_internal.transaction_statistics and crdb_internal.statement_statistics tables. These should not be confused with the transaction and statement IDs (*_txn_id and *_stmt_id), which are unique for each transaction and statement execution.

The crdb_internal.transaction_contention_events table can be used to summarize the frequency and duration of contention events by database, schema, table and index. This can be useful for determining which workloads have the highest frequency and duration of contention events and which indexes they are occurring on. The table includes details on specific transactions and statements involved in contention, enabling deeper analysis.

Other tables

There are several other tables that can give insights into contention.

crdb_internal.cluster_contended_indexes
crdb_internal.cluster_contended_tables
crdb_internal.cluster_contended_keys

These are views based on crdb_internal.transaction_contention_events that group all events by index, table, and key, respectively. A start and end time cannot be provided, so they count every event that is currently stored. As a result, these act more like counter metrics, which can be misleading since high values do not always indicate a high contention rate.
crdb_internal.cluster_locks

This is a virtual table that contains information about locks held by transactions on specific keys. It only shows this information for locks held at the time the table is queried.

Warning:

Similar to crdb_internal.transaction_contention_events, crdb_internal.cluster_locks triggers an RPC fan out to all nodes in the cluster to capture this information, so it can be a relatively expensive operation to query it.

Continuous monitoring

CockroachDB offers tools to monitor contention, but it is important to turn this data into actionable insights. The main question is: “Does contention significantly affect this workload?” Ideally, continuous monitoring should answer this without manual effort.

At the cluster level, track how often contention happens compared to overall throughput. For example, divide the rate of contention events (sql.distsql.contended_queries.count metric) by the total number of queries (sql.query.count metric) to get the percentage of queries affected by contention. This provides a high-level indicator of how frequently contention occurs but does not identify specific workloads.

If contention affects more than 1% of queries, use the crdb_internal table analysis to identify the impacted workloads and queries.

Warning:

Because querying the crdb_internal.transaction_contention_events table requires an expensive RPC fan-out, it is not recommended for use as a part of a continuous monitoring system.

Analyze using `crdb_internal` tables

To analyze contention causes, use crdb_internal.transaction_contention_events, crdb_internal.transaction_statistics and crdb_internal.statement_statistics tables.

First, retrieve the frequency and duration of contention events for a specific time period. These can be summarized at a database, schema, table, and index level to make it easy to identify if any workload has significant lock contention. Since some contention is expected, compare the frequency and duration of contention events against overall workload volume to assess the impact.

Next, individual transactions and statements can be analyzed to understand the specific causes and impacts of the contention. The transaction and statement fingerprints can be used to look up detailed information from the crdb_internal.transaction_statistics and crdb_internal.statement_statistics tables, such as the specific SQL.

Run `insights` workload

Use two terminals to analyze contention in the sample insights workload.

In Terminal 1, use the cockroach demo command to start a temporary, in-memory CockroachDB cluster of one node. Once the cluster is started, an interactive SQL shell will connect to that cluster. This shell will be used in the analysis process.
```
cockroach demo --no-example-database --insecure
```

In Terminal 2, use cockroach workload to load the initial schema on the demo cluster:

cockroach workload init insights 'postgresql://root@localhost:26257?sslmode=disable'

Then, run the workload for 10 minutes:

cockroach workload run insights --concurrency=128 --duration=10m 'postgresql://root@localhost:26257?sslmode=disable'

You'll see per-operation statistics print to standard output every second:

_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
    1.0s        0            4.0            4.0     25.2     60.8     60.8     60.8 joinOnNonIndexColumn
    1.0s        0          127.7          127.8     39.8     71.3     79.7     79.7 orderByOnNonIndexColumn
    1.0s        0          127.7          127.8     62.9     75.5     83.9     83.9 transfer
    2.0s        0            0.5            0.5    973.1    973.1    973.1    973.1 contention
    ...

After the specified duration (10 minutes in this case), the workload will stop and you'll see totals printed to standard output:

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
  600.0s        0          23739           39.6   3226.8   3758.1   6174.0   7784.6  12884.9

In the interactive SQL shell in Terminal 1, run the SQL queries in this test scenario.

`transaction_contention_events`

Query 1 time period

When analyzing contention in a workload, you need the time period when it occurred. For this test scenario, get the time period of contention events by finding the min and max of the collection_ts column. Adjust Query 2 and Query 3 with this time period clause.

SELECT CONCAT('BETWEEN ''',MIN(collection_ts),''' AND ''', MAX(collection_ts),'''') FROM crdb_internal.transaction_contention_events;

                                    concat
-------------------------------------------------------------------------------
  BETWEEN '2025-02-18 20:16:18.906348+00' AND '2025-02-18 20:56:25.030577+00'

Query 2 frequency and duration

Find the frequency, total duration, and average duration of contention events per database, schema, table, and index for the test time period. Internal contention events (e.g., to system tables) are excluded by omitting the fingerprints with the pattern '0000000000000000'.

SELECT COUNT(*) as cnt,
  SUM(contention_duration) as duration,                
  SUM(contention_duration)/count(*) as avg_duration,
  database_name, schema_name, table_name, index_name, contention_type
FROM crdb_internal.transaction_contention_events
WHERE collection_ts BETWEEN '2025-02-18 20:16:18.906348+00' AND '2025-02-18 20:56:25.030577+00'
  AND encode(blocking_txn_fingerprint_id, 'hex') != '0000000000000000'
  AND encode(waiting_txn_fingerprint_id, 'hex') != '0000000000000000'
GROUP BY database_name, schema_name, table_name, index_name, contention_type
ORDER BY COUNT(*) desc;

  cnt  |    duration     |  avg_duration   | database_name | schema_name |        table_name         |           index_name           | contention_type
-------+-----------------+-----------------+---------------+-------------+---------------------------+--------------------------------+------------------
  1281 | 00:11:37.329786 | 00:00:00.544364 | insights      | public      | insights_workload_table_0 | insights_workload_table_0_pkey | LOCK_WAIT
   157 | 00:04:32.919447 | 00:00:01.73834  | insights      | public      | insights_workload_table_1 | insights_workload_table_1_pkey | LOCK_WAIT

These results show that the primary key insights_workload_table_0_pkey had 1281 contention events in the time period with a total duration of 11m37.3s and an average duration of 544ms.

Query 3 blocking and waiting transaction fingerprint IDs

To find the top 2 combinations of blocking transaction and waiting transactions and statements, modify the previous query by:

Grouping by blocking_txn_fingerprint_id, waiting_txn_fingerprint_id,waiting_stmt_fingerprint_id.
Filtering by database_name, table_name, and index_name.
Then, placing the resulting query in a common table expression (CTE).

With x AS (
SELECT COUNT(*) AS cnt, 
  SUM(contention_duration) as duration,
  SUM(contention_duration)/count(*) as avg_duration,
  blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, waiting_stmt_fingerprint_id,
  database_name, schema_name, table_name, index_name, contention_type
FROM crdb_internal.transaction_contention_events
WHERE collection_ts BETWEEN '2025-02-18 20:16:18.906348+00' AND '2025-02-18 20:56:25.030577+00'
  AND encode(blocking_txn_fingerprint_id, 'hex') != '0000000000000000'
  AND encode(waiting_txn_fingerprint_id, 'hex') != '0000000000000000'
  AND database_name = 'insights' AND table_name = 'insights_workload_table_0' AND index_name = 'insights_workload_table_0_pkey'
GROUP BY blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, waiting_stmt_fingerprint_id, database_name, schema_name, table_name, index_name, contention_type
) 
SELECT row_number() OVER (), *
  FROM (
        SELECT cnt, blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, waiting_stmt_fingerprint_id
        FROM x ORDER BY cnt DESC LIMIT 2
  );

  row_number | cnt | blocking_txn_fingerprint_id | waiting_txn_fingerprint_id | waiting_stmt_fingerprint_id
-------------+-----+-----------------------------+----------------------------+------------------------------
           1 | 551 | \xebdfe9282ddfd5bd          | \x275ef4f9eea20099         | \x883d49b568a3b746
           2 | 288 | \x966b8907aeed74ec          | \x9df582e8f8802926         | \x32963fa47e819ef9

`transaction_statistics`

Query 4 blocking statement fingerprint IDs

To get the statements associated with the blocking transaction (\xebdfe9282ddfd5bd) from row_number 1, we can query the crdb_internal.transaction_statistics table:

SELECT fingerprint_id as blocking_txn_fingerprint_id, app_name, metadata->>'stmtFingerprintIDs' AS blocking_stmt_fingerprint_ids
FROM crdb_internal.transaction_statistics
WHERE fingerprint_id = '\xebdfe9282ddfd5bd' LIMIT 1;

  blocking_txn_fingerprint_id | app_name |                blocking_stmt_fingerprint_ids
------------------------------+----------+---------------------------------------------------------------
  \xebdfe9282ddfd5bd          | insights | ["4bb36019649a9816", "47778d05a0e698a4", "845cbd5dab02f860"]

`statement_statistics`

Query 5 blocking statement SQL

To get the SQL associated with the blocking statements, query the crdb_internal.statement_statistics table using the fingerprint IDs previously found.

SELECT fingerprint_id as blocking_stmt_fingerprint_id, app_name, metadata->>'query' AS query
FROM crdb_internal.statement_statistics
WHERE transaction_fingerprint_id='\xebdfe9282ddfd5bd' and fingerprint_id in ('\x4bb36019649a9816','\x47778d05a0e698a4','\x845cbd5dab02f860')
ORDER BY fingerprint_id

  blocking_stmt_fingerprint_id | app_name |                                  query
-------------------------------+----------+--------------------------------------------------------------------------
  \x47778d05a0e698a4           | insights | SELECT pg_sleep(_)
  \x4bb36019649a9816           | insights | UPDATE insights_workload_table_0 SET balance = balance - _ WHERE id = _
  \x845cbd5dab02f860           | insights | UPDATE insights_workload_table_0 SET balance = balance + _ WHERE id = _

Query 6 waiting statement SQL

To get the SQL associated with the waiting statement (\x883d49b568a3b746 of transaction \x275ef4f9eea20099), again query the crdb_internal.statement_statistics table.

SELECT fingerprint_id as waiting_stmt_fingerprint_id, app_name, metadata->>'query' AS query
FROM crdb_internal.statement_statistics
WHERE transaction_fingerprint_id='\x275ef4f9eea20099' and fingerprint_id = '\x883d49b568a3b746'

  waiting_stmt_fingerprint_id | app_name |                                    query
------------------------------+----------+------------------------------------------------------------------------------
  \x883d49b568a3b746          | insights | SELECT balance FROM insights_workload_table_0 ORDER BY balance DESC LIMIT _

A similar process can be applied to the second set (row_number 2) of blocking and waiting transactions and statements.

This process provides a view of the actual statements that are involved in the highest frequency of contention events.

Analysis of support scenario

This section applies a variation of the previously described analysis process to a hypothetical support scenario where high contention occurred during a period of increased errors.

Review the DB Console Metrics graphs to get a high-level understanding of the contention events. The SQL Statement Errors graph show an increase of errors during the time period of 9:16 to 9:23 UTC:

DB Console SQL Statement Errors graph

The SQL Statement Contention graph shows a corresponding increase between 9:16 and 9:23 UTC:

DB Console SQL Statement Contention graph

The Transaction Restarts graph also shows a corresponding increase between 9:16 and 9:23 UTC:

DB Console Transaction Restarts graph

These graphs help to understand the incident at a high-level, but not the specific transactions that are involved. To understand that, query crdb_internal.transaction_contention_events table.

`transaction_contention_events` for support scenario

Query 1 frequency and duration

Determine the frequency, total duration, and average duration of contention events per database, schema, table, and index for the time period. Internal contention events (e.g., to system tables) are excluded by omitting the fingerprints with the pattern '0000000000000000'.

SELECT COUNT(*) as cnt,
  SUM(contention_duration) as duration,                
  SUM(contention_duration)/count(*) as avg_duration,
  database_name, schema_name, table_name, index_name, contention_type
FROM crdb_internal.transaction_contention_events
WHERE collection_ts BETWEEN '2025-02-20 09:16:00' AND '2025-02-20 09:23:00'
  AND encode(blocking_txn_fingerprint_id, 'hex') != '0000000000000000'
  AND encode(waiting_txn_fingerprint_id, 'hex') != '0000000000000000'
GROUP BY database_name, schema_name, table_name, index_name, contention_type
ORDER BY COUNT(*) desc;
  cnt  |    duration     |  avg_duration   | database_name       | schema_name | table_name  | index_name          | contention_type
-------+-----------------+-----------------+---------------------+-------------+-------------+---------------------+------------------------
  4756 | ...             | ...             | Support-permissions | public      | UserOptions | UserOptions+userKey | SERIALIZATION_CONFLICT
     8 | ...             | ...             | Support-permissions | public      | UserOptions | UserOptions+id      | SERIALIZATION_CONFLICT

The results indicate that the main source of the contention is the UserOptions+userKey index on the UserOptions table.

Examine the table schema for UserOptions. Notice that the UserOptions+userKey index is a secondary unique index on the userKey column.

SHOW CREATE UserOptions;
  table_name | create_statement
-------------+-------------------------------------------------------------
"UserOptions"| CREATE TABLE public."UserOptions" (
             |     id INT8 NOT NULL DEFAULT unordered_unique_rowid(),
             |     "updatedUtc" INT8 NOT NULL,
             |     "userKey" STRING(40) NOT NULL,
             |     ...
             |     CONSTRAINT "UserOptions+id" PRIMARY KEY (id ASC),
             |     UNIQUE INDEX "UserOptions+userKey" ("userKey" ASC),
             |     ...
             | )

Query 2 specific `contending_key`

Determine which specific keys were involved in the contention events on that index:

SELECT contending_key, crdb_internal.pretty_key(contending_key, 0) as pretty_key, count(*) 
FROM crdb_internal.transaction_contention_events
WHERE collection_ts BETWEEN '2025-02-20 09:16:00' AND '2025-02-20 09:23:00'
  AND database_name = 'Support-permissions' AND table_name = 'UserOptions' AND index_name = 'UserOptions+userKey'
GROUP BY contending_key, crdb_internal.pretty_key(contending_key, 0)
ORDER BY count(*) DESC;

                contending_key                |        pretty_key         | count
-----------------------------------------------+---------------------------+--------
  \xf67f8a123739635a316b6a64316f303d2d31000188 | /127/2/"abcdefghijklmn"/0 |  4756

The results indicate that a single key ("abcdefghijklmn") accounts for all of the contention events during the incident time period.

Query 3 blocking and waiting transaction fingerprint IDs

Next, determine which specific transactions and statements were involved in the contention events. Modify the query by updating the database_name, table_name, and index_name. Increase the limit to 5.

With x AS (
SELECT COUNT(*) AS cnt, 
  SUM(contention_duration) as duration,
  SUM(contention_duration)/count(*) as avg_duration,
  blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, waiting_stmt_fingerprint_id,
  database_name, schema_name, table_name, index_name, contention_type
FROM crdb_internal.transaction_contention_events
WHERE collection_ts BETWEEN '2025-02-20 09:16:00' AND '2025-02-20 09:23:00'
  AND encode(blocking_txn_fingerprint_id, 'hex') != '0000000000000000'
  AND encode(waiting_txn_fingerprint_id, 'hex') != '0000000000000000'
  AND database_name = 'Support-permissions' AND table_name = 'UserOptions' AND index_name = 'UserOptions+userKey'
GROUP BY blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, waiting_stmt_fingerprint_id, database_name, schema_name, table_name, index_name, contention_type
) 
SELECT row_number() OVER (), *
  FROM (
        SELECT cnt, blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, waiting_stmt_fingerprint_id
        FROM x ORDER BY cnt DESC LIMIT 5
  );
  row_number | cnt  | blocking_txn_fingerprint_id | waiting_txn_fingerprint_id | waiting_stmt_fingerprint_id
-------------+------+-----------------------------+----------------------------+------------------------------
           1 | 1331 | \x9b06dfa27c208be3          | \x9b06dfa27c208be3         | \x346562eefa213c3c
           2 | 1164 | \x9b06f0a27c20af50          | \x9b06dfa27c208be3         | \x346562eefa213c3c
           3 | 1115 | \x9b06dfa27c208be3          | \x9b06f0a27c20af50         | \x346562eefa213c3c
           4 |  968 | \x9b06f0a27c20af50          | \x9b06f0a27c20af50         | \x346562eefa213c3c
           5 |   52 | \xcbf29ce484222325          | \x9b06dfa27c208be3         | \x346562eefa213c3c

The results indicate that the top 4 rows, which accounted for the vast majority of the contention events, all involved two blocking transaction fingerprints: \x9b06dfa27c208be3 and \x9b06f0a27c20af50. There is a single statement fingerprint that does all of the waiting: \x346562eefa213c3c.

`transaction_statistics` and `statement_statistics`

To examine the statements associated with the two transaction fingerprints, query the crdb_internal.transaction_statistics and crdb_internal.statement_statistics tables.

Query 4 row 1 blocking statement fingerprint IDs

SELECT fingerprint_id as blocking_txn_fingerprint_id, app_name, metadata->>'stmtFingerprintIDs' AS blocking_stmt_fingerprint_ids
FROM crdb_internal.transaction_statistics
WHERE fingerprint_id = '\x9b06dfa27c208be3' LIMIT 1;
  blocking_txn_fingerprint_id | app_name        | blocking_stmt_fingerprint_ids
------------------------------+-----------------+-------------------------------
  \x9b06dfa27c208be3          | permissions-api | ["346562eefa213c3c"]

Query 5 row 1 blocking statement metadata

Instead of selecting the metadata->>'query' key as in Query 5 blocking statement SQL, select the whole JSONB metadata object.

SELECT fingerprint_id as blocking_stmt_fingerprint_id, app_name, jsonb_pretty(metadata) AS metadata
FROM crdb_internal.statement_statistics
WHERE transaction_fingerprint_id='\x9b06dfa27c208be3' and fingerprint_id in ('\x346562eefa213c3c')
ORDER BY fingerprint_id;
    fingerprint_id   |       app_name        | metadata
---------------------+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  \x346562eefa213c3c | permissions-api       | {
                     |                       |     "db": "Support-permissions",
                     |                       |     "distsql": false,
                     |                       |     "failed": true,
                     |                       |     "fullScan": false,
                     |                       |     "implicitTxn": true,
                     |                       |     "query": "INSERT INTO \"UserOptions\"(\"userKey\", \"updatedUtc\") VALUES ($1, $1) ON CONFLICT (\"userKey\") DO UPDATE SET \"updatedUtc\" = excluded.\"updatedUtc\" ...
                     |                       |     "querySummary": "INSERT INTO \"UserOptions\".(\"userKey\", \"updatedUtc\")",
                     |                       |     "stmtType": "TypeDML",
                     |                       |     "vec": false
                     |                       | }

Note that in metadata, the key "failed" has the value of true.

Query 6 row 2 blocking statement fingerprint IDs

SELECT fingerprint_id as blocking_txn_fingerprint_id, app_name, metadata->>'stmtFingerprintIDs' AS blocking_stmt_fingerprint_ids
FROM crdb_internal.transaction_statistics
WHERE fingerprint_id = '\x9b06f0a27c20af50' LIMIT 1;
  blocking_txn_fingerprint_id | app_name        | blocking_stmt_fingerprint_ids
------------------------------+-----------------+-------------------------------
  \x9b06f0a27c20af50          | permissions-api | ["x34654deefa21188f"]

Query 7 row 2 blocking statement metadata

SELECT fingerprint_id as blocking_stmt_fingerprint_id, app_name, jsonb_pretty(metadata) AS metadata
FROM crdb_internal.statement_statistics
WHERE transaction_fingerprint_id='\x9b06f0a27c20af50' and fingerprint_id in ('\x34654deefa21188f')
ORDER BY fingerprint_id;
    fingerprint_id   |       app_name        | metadata
---------------------+-----------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  \x34654deefa21188f | permissions-api       | {
                     |                       |     "db": "Support-permissions",
                     |                       |     "distsql": false,
                     |                       |     "failed": false,
                     |                       |     "fullScan": false,
                     |                       |     "implicitTxn": true,
                     |                       |     "query": "INSERT INTO \"UserOptions\"(\"userKey\", \"updatedUtc\") VALUES ($1, $1) ON CONFLICT (\"userKey\") DO UPDATE SET \"updatedUtc\" = excluded.\"updatedUtc\" ...
                     |                       |     "querySummary": "INSERT INTO \"UserOptions\".(\"userKey\", \"updatedUtc\")",
                     |                       |     "stmtType": "TypeDML",
                     |                       |     "vec": false
                     |                       | }

Note that in metadata, the key "failed" has the value of false.

Transactions from row 1 and row 2 are both the INSERT or UPDATE operations with the first having failed executions ("failed": true) and the second successful ("failed": false).

From this analysis, we can conclude that a large number of INSERT or UPDATE operations were operating on the same userKey ("abcdefghijklmn") in the UserOptions@UserOptions+userKey index during the time period of the incident.

The next steps would be to investigate what on the application side may have caused this large number of updates on a single key.

Analyze using Insights page

While the previous analysis process uses the crdb_internal tables, the performance tuning tutorial Troubleshoot Lock Contention gives basic examples to understand lock contention and how to analyze it using the Insights page.

Pricing

Contact us

Sign In

Monitor and Analyze Transaction Contention

Monitor using console

SQL Statement Contention graph

SQL Activity

Statements page

Transactions page

Insights page

Monitor using metrics

Monitor using `crdb_internal` tables

`transaction_contention_events` table

Other tables

Continuous monitoring

Analyze using `crdb_internal` tables

Run `insights` workload

`transaction_contention_events`

Query 1 time period

Query 2 frequency and duration

Query 3 blocking and waiting transaction fingerprint IDs

`transaction_statistics`

Query 4 blocking statement fingerprint IDs

`statement_statistics`

Query 5 blocking statement SQL

Query 6 waiting statement SQL

Analysis of support scenario

`transaction_contention_events` for support scenario

Query 1 frequency and duration

Query 2 specific `contending_key`

Query 3 blocking and waiting transaction fingerprint IDs

`transaction_statistics` and `statement_statistics`

Query 4 row 1 blocking statement fingerprint IDs

Query 5 row 1 blocking statement metadata

Query 6 row 2 blocking statement fingerprint IDs

Query 7 row 2 blocking statement metadata

Analyze using Insights page

See Also

Monitor and Analyze Transaction Contention

Monitor using console

SQL Statement Contention graph

SQL Activity

Statements page

Transactions page

Insights page

Monitor using metrics

Monitor using crdb_internal tables

transaction_contention_events table

Other tables

Continuous monitoring

Analyze using crdb_internal tables

Run insights workload

transaction_contention_events

Query 1 time period

Query 2 frequency and duration

Query 3 blocking and waiting transaction fingerprint IDs

transaction_statistics

Query 4 blocking statement fingerprint IDs

statement_statistics

Query 5 blocking statement SQL

Query 6 waiting statement SQL

Analysis of support scenario

transaction_contention_events for support scenario

Query 1 frequency and duration

Query 2 specific contending_key

Query 3 blocking and waiting transaction fingerprint IDs

transaction_statistics and statement_statistics

Query 4 row 1 blocking statement fingerprint IDs

Query 5 row 1 blocking statement metadata

Query 6 row 2 blocking statement fingerprint IDs

Query 7 row 2 blocking statement metadata

Analyze using Insights page

See Also

Monitor using `crdb_internal` tables

`transaction_contention_events` table

Analyze using `crdb_internal` tables

Run `insights` workload

`transaction_contention_events`

`transaction_statistics`

`statement_statistics`

`transaction_contention_events` for support scenario

Query 2 specific `contending_key`

`transaction_statistics` and `statement_statistics`