Performance Scalability Post:
Symptom:
Slowness observed. "node select queries are taking up majority of time resulting in slowness of the application"
in PDOStatement::execute called at /core/lib/Drupal/Core/Database/StatementWrapper.php (145)...Core\Database\StatementWrapper::execute called at /core/lib/Drupal/Core/Database/Connection.php (937)...Database\Connection::query called at /core/lib/Drupal/Core/Database/Driver/pgsql/Connection.php (183)...
These queries are taking longer than 5 seconds.
SELECT "base_table"."vid" AS "vid", "base_table"."nid" AS "nid" + FROM + "node" "base_table" + INNER JOIN "node" "node" ON "node"."nid" = "base_table"."nid" + INNER JOIN "node_field_data" "node_field_data" ON "node_field_data"."nid" = "base_table"."nid" + WHERE ((LOWER("node"."uuid") IN (LOWER('$1')))) AND ("node_field_data"."default_langcode" IN ('$2'))
Analysis:
Repeat the issue locally:
Edit postgresql.conf
log_statement = 'none' log_min_duration_statement = 5000
I also set `log_statement = 'all'` initally but that creates a huge file which you should be careful of not to overload your system. Though it can be helpful to see everything.
Otherwise setting `log_min_duration_statement = 5000` means only log queries that take over 5 seconds.
Then there was some time figuring out how to replicate the issue. But eventually I found it was related to saving an article based on some custom event we have.
The above codeblock points to the execute() function in Driver/pgsql
the execute() function uses the compile() function which uses translateCondition()
in translateCondition() we are seeing the `['where'] = 'LOWER('` . section which we can see in the slow logs. We saw this same function with similar behavior existing in Drupal 8. But for some reason it was not a performance issue then. Now with Drupal 9 the performance impact is now more visible.
public static function translateCondition(&$condition, SelectInterface $sql_query, $case_sensitive) { if (is_array($condition['value']) && $case_sensitive === FALSE) { $condition['where'] = 'LOWER(' . $sql_query->escapeField($condition['real_field']) . ') ' . $condition['operator'] . ' ('; $condition['where_args'] = []; // Only use the array values in case an associative array is passed as an // argument following similar pattern in // \Drupal\Core\Database\Connection::expandArguments(). $where_prefix = str_replace('.', '_', $condition['real_field']); foreach ($condition['value'] as $key => $value) { $where_id = $where_prefix . $key; $condition['where'] .= 'LOWER(:' . $where_id . '),'; $condition['where_args'][':' . $where_id] = $value; } $condition['where'] = trim($condition['where'], ','); $condition['where'] .= ')'; } parent::translateCondition($condition, $sql_query, $case_sensitive); } }
In general we likely would not want to change the Core execute() function or the translateCondition() function since this is good behavior most of the time.
Though it is odd that "PostgreSQL is a case-sensitive database by default" and yet the default ends up fitting into the condition `$case_sensitive === FALSE`. Potentially core should be updated. But this should likely be a discussion with the maintainers.
['where'] = 'LOWER(' . is part of:
translateCondition() function which is called in core/lib/Drupal/Core/Entity/Query/Sql/pgsql/Condition.php
New Relic:
In New Relic we can see `Postgres node select` was consuming 84% of the db operations.
Cluster Control:
In ClusterControl we can see the top slowest Query was the `SELECT "base_table"."vid" AS` as seen above and it had a near 3 million count in just a few days.
Fix:
Alter the query that uses LOWER for the condition that matches 'LOWER("node"."uuid")
/** * Implements hook_query_TAG_alter(). * * @param \Drupal\Core\Database\Query\AlterableInterface $query */ function core_query_entity_query_node_alter(Drupal\Core\Database\Query\AlterableInterface $query) { $conditions = &$query->conditions(); if (isset($conditions[0]['field']) && $conditions[0]['field'] == 'LOWER("node"."uuid") IN (LOWER(:node_uuid0))') { $conditions[0]['field'] = '"node"."uuid" = :node_uuid0'; } }
Now after the fix, the LOWER still hits all other queries it normally would hit, except for the condition of node uuid
We now see it as :
SELECT "base_table"."vid" AS "vid", "base_table"."nid" AS "nid" + FROM + "node" "base_table" + INNER JOIN "node" "node" ON "node"."nid" = "base_table"."nid" + INNER JOIN "node_field_data" "node_field_data" ON "node_field_data"."nid" = "base_table"."nid" + WHERE ((("node"."uuid" = '$1'))) AND ("node_field_data"."default_langcode" IN ('$2'))












