Duplicate records in Drupal can cause significant issues, particularly when they lead to integrity constraint violations in the database. These errors often occur due to duplicate UUIDs in the `node` table, which can result from programmatic imports, migrations, or unintended database inconsistencies. This guide provides a structured approach to identifying and resolving duplicate records in Drupal.
Identifying duplicate records
Before attempting to delete or update any records, developers should first identify duplicate entries. Running the following SQL query in the database will return a list of UUIDs that appear more than once:
SELECT uuid, COUNT(*) as count
FROM node_field_data
GROUP BY uuid
HAVING COUNT(*) > 1;
This query helps pinpoint the UUIDs that have been duplicated. To get the corresponding node IDs (`nid`) associated with these UUIDs, use:
SELECT nid, uuid
FROM node_field_data
WHERE uuid IN (
SELECT uuid
FROM node_field_data
GROUP BY uuid
HAVING COUNT(*) > 1
)
ORDER BY uuid;
This will provide a detailed list of the affected records.
Removing duplicate records
Once the duplicate records have been identified, developers must carefully decide which ones to remove. To delete one duplicate per UUID, the following SQL query can be used:
DELETE FROM node
WHERE nid IN (
SELECT nid FROM (
SELECT nid FROM node_field_data
WHERE uuid IN (
SELECT uuid
FROM node_field_data
GROUP BY uuid
HAVING COUNT(*) > 1
)
ORDER BY nid DESC
LIMIT 1
) AS duplicates
);
Caution: Before executing any deletion queries, it is crucial to back up the database to prevent unintended data loss.
Clearing the cache and rebuilding Drupal
After deleting duplicate records, the site’s cache should be cleared to ensure data consistency. This can be done using Drush:
drush cr
Clearing the cache ensures that Drupal properly registers the changes made to the database.
Alternative approach using Drupal Console
For those using Drupal Console, an alternative approach is to remove nodes by UUID using:
drush entity:delete node --uuid="fdd5be36-72e8-4fd0-bc14-6bbaf631be89"
Replacing the UUID with the affected record’s UUID will remove the duplicate entry cleanly.
Preventing duplicate records in the future
To avoid similar issues in the future, developers should:
- Implement UUID checks before inserting new nodes
- Use `entity_load_by_uuid` to prevent duplication
- Enable `update_existing: true` in migrations to update existing nodes rather than creating duplicates.
By following these best practices, you can maintain database integrity and prevent future errors related to duplicate records in Drupal. Ensuring clean data structures will help keep Drupal-based applications running smoothly and efficiently.