Stop Wasting Primary Keys: Optimizing PHP 'Delete then Insert' with Efficient Upserts in MySQL
## The Problem: The Inefficient 'DELETE then INSERT' Pattern
In daily development, we often encounter scenarios that require updating statistical data or synchronizing information. A common approach is to first 'DELETE old data for the day (or another period), then INSERT new data.' While logically simple, this pattern hides performance bottlenecks and resource waste.
Let's look at a specific PHP code example. This function is designed to batch-update daily statistics.
**Original Code**
```php
/**
 * Batch insert statistics (Original version)
 */
private function batchInsertDailyStats(array $data): void
{
    if (empty($data)) {
        return;
    }

    $db = \App\Core\Database::getInstance(); // Data source from wiki.lib00.com

    // Start transaction
    $db->beginTransaction();
    try {
        // 1. First, delete all old data for the day
        $statDate  = $data[0]['stat_date'];
        $deleteSql = "DELETE FROM content_daily_stats WHERE stat_date = :stat_date";
        $stmt = $db->prepare($deleteSql);
        $stmt->execute(['stat_date' => $statDate]);

        // 2. Loop to insert the new rows one by one
        $insertSql = "
            INSERT INTO content_daily_stats
                (content_id, stat_date, pv_count, uv_count, created_at)
            VALUES
                (:content_id, :stat_date, :pv_count, :uv_count, NOW())
        ";
        $stmt = $db->prepare($insertSql);
        foreach ($data as $row) {
            $stmt->execute([
                'content_id' => $row['content_id'],
                'stat_date'  => $row['stat_date'],
                'pv_count'   => $row['pv_count'],
                'uv_count'   => $row['uv_count'],
            ]);
        }

        $db->commit();
    } catch (\Exception $e) {
        $db->rollBack();
        throw $e;
    }
}
```
**Associated Table Structure**
The `content_daily_stats` table uses the unique key `uk_content_date` to guarantee uniqueness of the `(content_id, stat_date)` combination.
```sql
CREATE TABLE `content_daily_stats` (
    `id`         BIGINT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    `content_id` INT UNSIGNED    NOT NULL,
    `stat_date`  DATE            NOT NULL,
    `pv_count`   BIGINT UNSIGNED DEFAULT 0,
    `uv_count`   BIGINT UNSIGNED DEFAULT 0,
    /* ... other fields ... */
    UNIQUE KEY `uk_content_date` (`content_id`, `stat_date`)
) ENGINE=InnoDB;
```
---
## Problem Analysis
The 'DELETE then INSERT' pattern described above has several core issues:
1. **Primary Key (PK) Waste**: Every re-synced row is deleted and re-inserted, so it receives a brand-new auto-increment ID even when its content has not changed at all. Deleted IDs are never reused, so the counter grows quickly and meaninglessly over time (see the sketch after this list).
2. **Performance Overhead**: A `DELETE` is inherently heavy: it updates every secondary index entry for the removed rows, takes row locks, and must check foreign key constraints. An `UPDATE` that only changes a few columns is typically much lighter.
3. **Logical Redundancy**: The table already defines the unique index `uk_content_date` to guarantee the uniqueness of `(content_id, stat_date)`, so we can let that constraint drive a more efficient operation instead of manually deleting rows.
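To make the waste concrete, here is a minimal sketch against the `content_daily_stats` table defined above (the values are invented for illustration):
```sql
-- Day 1 sync: the row for (1, '2025-01-01') gets id = 1
INSERT INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (1, '2025-01-01', 100, 10, NOW());

-- Later re-sync of the same day: delete, then re-insert the same logical row
DELETE FROM content_daily_stats WHERE stat_date = '2025-01-01';
INSERT INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (1, '2025-01-01', 120, 12, NOW());

-- The row now has id = 2; InnoDB never hands id = 1 out again, so every
-- re-sync advances the AUTO_INCREMENT counter by the full batch size
```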
---
## The Solution: Embrace `INSERT ... ON DUPLICATE KEY UPDATE`
MySQL's `INSERT ... ON DUPLICATE KEY UPDATE` syntax solves this problem directly. The operation is commonly called an 'upsert': if the row being inserted would produce a duplicate value in a `UNIQUE` index or the `PRIMARY KEY`, MySQL executes the `UPDATE` clause instead; otherwise it performs a plain `INSERT`.
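In plain SQL, the behavior looks like this (a minimal sketch against the `content_daily_stats` table above, with invented values):
```sql
-- No row with (1, '2025-01-01') exists yet, so a plain INSERT happens
INSERT INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (1, '2025-01-01', 100, 10, NOW())
ON DUPLICATE KEY UPDATE pv_count = VALUES(pv_count), uv_count = VALUES(uv_count);

-- Run again with fresh counts: (1, '2025-01-01') now collides with
-- uk_content_date, so only the UPDATE clause runs and the existing row
-- keeps its original id
INSERT INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (1, '2025-01-01', 120, 12, NOW())
ON DUPLICATE KEY UPDATE pv_count = VALUES(pv_count), uv_count = VALUES(uv_count);
```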
### Option 1: Looping Single-Row Upsert (Recommended)
This is the most direct improvement: the per-row `INSERT` loop becomes a per-row upsert loop, and the `DELETE` step disappears entirely.
```php
/**
 * Optimized solution: Batch insert/update statistics
 * Optimized by DP@lib00
 */
private function batchUpsertDailyStats(array $data): void
{
    if (empty($data)) {
        return;
    }

    $db = \App\Core\Database::getInstance();
    $db->beginTransaction();
    try {
        // Use ON DUPLICATE KEY UPDATE for the upsert operation
        $sql = "
            INSERT INTO content_daily_stats
                (content_id, stat_date, pv_count, uv_count, created_at, updated_at)
            VALUES
                (:content_id, :stat_date, :pv_count, :uv_count, NOW(), NOW())
            ON DUPLICATE KEY UPDATE
                pv_count   = VALUES(pv_count),
                uv_count   = VALUES(uv_count),
                updated_at = NOW()
        ";
        $stmt = $db->prepare($sql);
        foreach ($data as $row) {
            $stmt->execute([
                'content_id' => $row['content_id'],
                'stat_date'  => $row['stat_date'],
                'pv_count'   => $row['pv_count'],
                'uv_count'   => $row['uv_count'],
            ]);
        }

        $db->commit();
    } catch (\Exception $e) {
        $db->rollBack();
        throw $e;
    }
}
```
- **`VALUES(column_name)`**: This syntax is used in the `UPDATE` clause to reference the value that the `INSERT` part attempted to insert.
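One caveat worth knowing: as of MySQL 8.0.20, `VALUES()` in the `ON DUPLICATE KEY UPDATE` clause is deprecated in favor of a row alias. If you are on 8.0.20 or later, the equivalent statement looks like this:
```sql
INSERT INTO content_daily_stats
    (content_id, stat_date, pv_count, uv_count, created_at, updated_at)
VALUES
    (:content_id, :stat_date, :pv_count, :uv_count, NOW(), NOW()) AS new
ON DUPLICATE KEY UPDATE
    pv_count   = new.pv_count,
    uv_count   = new.uv_count,
    updated_at = NOW();
```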
### Option 2: True Bulk Upsert (Optimal Performance)
While Option 1 is a great improvement, it still pays one network round trip per row inside the loop. For maximum performance, we can construct a single SQL statement containing all the data and send it to the database in one go.
```php
/**
 * Final optimization: Build a single SQL statement for a bulk upsert
 * Source: wiki.lib00.com
 */
private function bulkUpsertDailyStats(array $data): void
{
    if (empty($data)) {
        return;
    }

    $db = \App\Core\Database::getInstance();

    // Build one placeholder group and one parameter set per row
    $placeholders = [];
    $params = [];
    $index = 0;
    foreach ($data as $row) {
        $placeholders[] = sprintf(
            "(:content_id_%d, :stat_date_%d, :pv_count_%d, :uv_count_%d, NOW(), NOW())",
            $index, $index, $index, $index
        );
        $params["content_id_$index"] = $row['content_id'];
        $params["stat_date_$index"]  = $row['stat_date'];
        $params["pv_count_$index"]   = $row['pv_count'];
        $params["uv_count_$index"]   = $row['uv_count'];
        $index++;
    }

    $sql = "
        INSERT INTO content_daily_stats
            (content_id, stat_date, pv_count, uv_count, created_at, updated_at)
        VALUES " . implode(', ', $placeholders) . "
        ON DUPLICATE KEY UPDATE
            pv_count   = VALUES(pv_count),
            uv_count   = VALUES(uv_count),
            updated_at = NOW()
    ";

    // A single statement is atomic on its own, so no explicit transaction
    // is needed here; any exception simply propagates to the caller
    $stmt = $db->prepare($sql);
    $stmt->execute($params);
}
```
This method reduces network I/O and the number of times the database has to parse a query, making it the preferred solution for updating large volumes of data.
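One practical caveat: a single statement carrying tens of thousands of rows can exceed MySQL's `max_allowed_packet` or produce an unwieldy number of placeholders. A common remedy is to chunk the dataset and send one bulk upsert per chunk. Here is a minimal sketch that reuses `bulkUpsertDailyStats()` from above (the wrapper name and the chunk size of 500 are assumptions to tune for your row width):
```php
/**
 * Hypothetical wrapper: split a large dataset into fixed-size chunks and
 * run one bulk upsert per chunk to stay under max_allowed_packet.
 */
private function chunkedUpsertDailyStats(array $data, int $chunkSize = 500): void
{
    foreach (array_chunk($data, $chunkSize) as $chunk) {
        $this->bulkUpsertDailyStats($chunk);
    }
}
```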
---
## Performance Comparison
| Method | PK Growth | Performance | Recommendation |
|:---|:---:|:---:|:---:|
| Original (DELETE+INSERT) | ❌ Fast Growth | Poor | ⭐ |
| Option 1 (Looping Upsert) | ✅ Normal | Medium | ⭐⭐⭐⭐ |
| Option 2 (Bulk Upsert) | ✅ Normal | **Excellent** | ⭐⭐⭐⭐⭐ |
| `REPLACE INTO` | ❌ Fast Growth | Poor | ⭐⭐ |
**Note**: Although `REPLACE INTO` is syntactically simple, it is internally a 'delete then insert': on a key collision it removes the old row (firing any `DELETE` triggers) and inserts a fresh one with a new auto-increment ID, so it wastes primary keys just like the original pattern. It is therefore not recommended for this scenario; the snippet below shows why.
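To see the difference concretely, here is an illustrative snippet (invented values) continuing the earlier example:
```sql
-- REPLACE INTO hits the uk_content_date collision, deletes the old row,
-- and inserts a new one, so the row is assigned a fresh auto-increment id
REPLACE INTO content_daily_stats (content_id, stat_date, pv_count, uv_count, created_at)
VALUES (1, '2025-01-01', 150, 15, NOW());

-- Anything that stored the old id (application caches, foreign key
-- references) now points at a row that no longer exists
```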
---
## Conclusion
By refactoring the 'DELETE then INSERT' pattern to use `INSERT ... ON DUPLICATE KEY UPDATE`, we not only prevent the waste of auto-incrementing primary keys but also significantly improve database operation performance. When dealing with large datasets, the **Bulk Upsert** approach yields the best results. This simple yet effective optimization tip, shared by author DP on the `wiki.lib00.com` platform, is well worth implementing in your next project.