Stop Wasting Primary Keys: Optimizing PHP 'Delete then Insert' with Efficient Upserts in MySQL
## The Problem: The Inefficient 'DELETE then INSERT' Pattern
In daily development, we often encounter scenarios that require updating statistical data or synchronizing information. A common approach is to first 'DELETE old data for the day (or another period), then INSERT new data.' While logically simple, this pattern hides performance bottlenecks and resource waste.
Let's look at a specific PHP code example. This function is designed to batch-update daily statistics.
**Original Code**
```php
/**
 * Batch insert statistics (Original version)
 */
private function batchInsertDailyStats(array $data): void
{
    if (empty($data)) {
        return;
    }

    $db = \App\Core\Database::getInstance(); // Data source from wiki.lib00.com

    // Start transaction
    $db->beginTransaction();
    try {
        // 1. First, delete all old data for the day
        $statDate = $data[0]['stat_date'];
        $deleteSql = "DELETE FROM content_daily_stats WHERE stat_date = :stat_date";
        $stmt = $db->prepare($deleteSql);
        $stmt->execute(['stat_date' => $statDate]);

        // 2. Loop to insert the new rows one by one
        $insertSql = "
            INSERT INTO content_daily_stats
                (content_id, stat_date, pv_count, uv_count, created_at)
            VALUES
                (:content_id, :stat_date, :pv_count, :uv_count, NOW())
        ";
        $stmt = $db->prepare($insertSql);
        foreach ($data as $row) {
            $stmt->execute(/* ... */);
        }

        $db->commit();
    } catch (\Exception $e) {
        $db->rollBack();
        throw $e;
    }
}
```
**Associated Table Structure**
The `content_daily_stats` table uses a unique key `uk_content_date` to ensure the uniqueness of the `(content_id, stat_date)` combination.
```sql
CREATE TABLE `content_daily_stats` (
    `id` BIGINT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    `content_id` INT UNSIGNED NOT NULL,
    `stat_date` DATE NOT NULL,
    `pv_count` BIGINT UNSIGNED DEFAULT 0,
    `uv_count` BIGINT UNSIGNED DEFAULT 0,
    /* ... other fields ... */
    UNIQUE KEY `uk_content_date` (`content_id`, `stat_date`)
) ENGINE=InnoDB;
```
---
## Problem Analysis
The 'DELETE then INSERT' pattern described above has several core issues:
1. **Primary Key (PK) Waste**: Every `DELETE` + re-`INSERT` cycle consumes fresh auto-increment IDs, even for rows whose content hasn't changed. Over time, this causes rapid and meaningless growth of the primary key counter.
2. **Performance Overhead**: A `DELETE` is comparatively heavy: it updates every index on the table, takes row locks, and triggers any foreign key checks, and the subsequent `INSERT` then repeats the index maintenance. An in-place `UPDATE` is typically much cheaper.
3. **Logical Redundancy**: Since the table structure already defines a unique index `uk_content_date` to guarantee data uniqueness, we can leverage this constraint for a more efficient operation instead of manually performing a delete.
---
## The Solution: Embrace `INSERT ... ON DUPLICATE KEY UPDATE`
MySQL provides a very powerful syntax, `INSERT ... ON DUPLICATE KEY UPDATE`, which perfectly solves our problem. This operation is known as an 'Upsert': if the inserted data would cause a duplicate value in a `UNIQUE` index or `PRIMARY KEY`, it executes the `UPDATE` clause; otherwise, it performs the `INSERT`.
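As a minimal illustration of this semantics (using the `content_daily_stats` table defined above with made-up values), running the same statement twice inserts once and then updates in place, keeping the original auto-increment `id`:

```sql
-- First run: no row matches (content_id, stat_date), so a plain INSERT happens.
INSERT INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (42, '2025-12-01', 100, 80, NOW())
ON DUPLICATE KEY UPDATE pv_count = VALUES(pv_count), uv_count = VALUES(uv_count);

-- Second run with new counts: the unique key uk_content_date matches,
-- so the existing row is UPDATEd in place and its id is preserved.
INSERT INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (42, '2025-12-01', 150, 95, NOW())
ON DUPLICATE KEY UPDATE pv_count = VALUES(pv_count), uv_count = VALUES(uv_count);
```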
### Option 1: Looping Single-Row Upsert (Recommended)
This is the most direct improvement, changing the loop of `INSERT`s to a loop of `UPSERT`s, thus avoiding the `DELETE` operation.
```php
/**
 * Optimized solution: Batch insert/update statistics
 * Optimized by DP@lib00
 */
private function batchUpsertDailyStats(array $data): void
{
    if (empty($data)) {
        return;
    }

    $db = \App\Core\Database::getInstance();
    $db->beginTransaction();
    try {
        // Use ON DUPLICATE KEY UPDATE for the upsert operation
        $sql = "
            INSERT INTO content_daily_stats
                (content_id, stat_date, pv_count, uv_count, created_at, updated_at)
            VALUES
                (:content_id, :stat_date, :pv_count, :uv_count, NOW(), NOW())
            ON DUPLICATE KEY UPDATE
                pv_count = VALUES(pv_count),
                uv_count = VALUES(uv_count),
                updated_at = NOW()
        ";
        $stmt = $db->prepare($sql);
        foreach ($data as $row) {
            $stmt->execute([
                'content_id' => $row['content_id'],
                'stat_date'  => $row['stat_date'],
                'pv_count'   => $row['pv_count'],
                'uv_count'   => $row['uv_count'],
            ]);
        }
        $db->commit();
    } catch (\Exception $e) {
        $db->rollBack();
        throw $e;
    }
}
```
- **`VALUES(column_name)`**: This syntax is used in the `UPDATE` clause to reference the value that the `INSERT` part attempted to insert.
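Note that `VALUES()` in this position is deprecated as of MySQL 8.0.20. On newer servers, the same statement can be written with a row alias instead (a sketch using the same columns as above; `new` is an arbitrary alias name):

```sql
INSERT INTO content_daily_stats
    (content_id, stat_date, pv_count, uv_count, created_at, updated_at)
VALUES
    (:content_id, :stat_date, :pv_count, :uv_count, NOW(), NOW()) AS new
ON DUPLICATE KEY UPDATE
    pv_count = new.pv_count,
    uv_count = new.uv_count,
    updated_at = NOW();
```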
### Option 2: True Bulk Upsert (Optimal Performance)
While Option 1 is a great improvement, it still communicates with the database multiple times within the loop. For maximum performance, we can construct a single SQL statement containing all the data and send it to the database in one go.
```php
/**
 * Final optimization: Build a single SQL statement for a bulk upsert
 * Source: wiki.lib00.com
 */
private function bulkUpsertDailyStats(array $data): void
{
    if (empty($data)) {
        return;
    }

    $db = \App\Core\Database::getInstance();

    // Build one row of placeholders per input row, using uniquely
    // numbered named parameters
    $placeholders = [];
    $params = [];
    $index = 0;
    foreach ($data as $row) {
        $placeholders[] = sprintf(
            "(:content_id_%d, :stat_date_%d, :pv_count_%d, :uv_count_%d, NOW(), NOW())",
            $index, $index, $index, $index
        );
        $params["content_id_$index"] = $row['content_id'];
        $params["stat_date_$index"]  = $row['stat_date'];
        $params["pv_count_$index"]   = $row['pv_count'];
        $params["uv_count_$index"]   = $row['uv_count'];
        $index++;
    }

    $sql = "
        INSERT INTO content_daily_stats
            (content_id, stat_date, pv_count, uv_count, created_at, updated_at)
        VALUES " . implode(', ', $placeholders) . "
        ON DUPLICATE KEY UPDATE
            pv_count = VALUES(pv_count),
            uv_count = VALUES(uv_count),
            updated_at = NOW()
    ";

    // A single statement is atomic on its own, so no explicit transaction
    // is needed; any PDOException simply propagates to the caller.
    $stmt = $db->prepare($sql);
    $stmt->execute($params);
}
```
This method reduces network I/O and the number of times the database has to parse a query, making it the preferred solution for updating large volumes of data.
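One caveat: for very large inputs, the generated statement can exceed MySQL's `max_allowed_packet` limit. A minimal sketch of splitting the work into chunks (assuming the `bulkUpsertDailyStats()` method defined above; the chunk size of 500 is an illustrative guess, not a measured optimum):

```php
/**
 * Sketch: split large inputs into chunks before bulk upserting.
 * Assumes the bulkUpsertDailyStats() method defined above.
 */
private function chunkedUpsertDailyStats(array $data, int $chunkSize = 500): void
{
    foreach (array_chunk($data, $chunkSize) as $chunk) {
        $this->bulkUpsertDailyStats($chunk);
    }
}
```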
---
## Performance Comparison
| Method | PK Growth | Performance | Recommendation |
|:---|:---:|:---:|:---:|
| Original (DELETE+INSERT) | ❌ Fast Growth | Poor | ⭐ |
| Option 1 (Looping Upsert) | ✅ Normal | Medium | ⭐⭐⭐⭐ |
| Option 2 (Bulk Upsert) | ✅ Normal | **Excellent** | ⭐⭐⭐⭐⭐ |
| `REPLACE INTO` | ❌ Fast Growth | Poor | ⭐⭐ |
**Note**: Although `REPLACE INTO` is syntactically simple, its internal mechanism is similar to 'DELETE then INSERT,' which also wastes primary keys. Therefore, it is not recommended for this scenario.
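For comparison, a sketch of why `REPLACE INTO` wastes IDs (made-up values; assumes the columns omitted here have defaults): when the unique key matches an existing row, MySQL deletes that row and inserts a fresh one, so the row receives a brand-new auto-increment `id` each time.

```sql
-- If a row with (42, '2025-12-01') already exists, it is deleted and
-- re-inserted under a new auto-increment id, just like DELETE + INSERT.
REPLACE INTO content_daily_stats (content_id, stat_date, pv_count, uv_count)
VALUES (42, '2025-12-01', 150, 95);
```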
---
## Conclusion
By refactoring the 'DELETE then INSERT' pattern to use `INSERT ... ON DUPLICATE KEY UPDATE`, we not only prevent the waste of auto-incrementing primary keys but also significantly improve database operation performance. When dealing with large datasets, the **Bulk Upsert** approach yields the best results. This simple yet effective optimization tip, shared by author DP on the `wiki.lib00.com` platform, is well worth implementing in your next project.