Question
How can I backup a specific partition in ClickHouse?
Answer
See the below example, this uses the S3(Minio) disk configuration listed in our docker compose examples page.
This does NOT apply to ClickHouse Cloud
Create a table:
ch_minio_s3 :) CREATE TABLE my_table
(
`event_time` DateTime,
`field_foo` String,
`field_bar` String,
`number` UInt256
)
ENGINE = MergeTree
PARTITION BY number % 2
ORDER BY tuple()
CREATE TABLE my_table
(
`event_time` DateTime,
`field_foo` String,
`field_bar` String,
`number` UInt256
)
ENGINE = MergeTree
PARTITION BY number % 2
ORDER BY tuple()
Query id: a1a54a5a-eac0-477c-b847-b40acaa62780
Ok.
0 rows in set. Elapsed: 0.016 sec.
Add some data that will fill both partitions equally:
ch_minio_s3 :) INSERT INTO my_table SELECT
toDateTime(now() + number) AS event_time,
randomPrintableASCII(10) AS field_foo,
randomPrintableASCII(20) AS field_bar,
number
FROM numbers(1000000)
INSERT INTO my_table SELECT
toDateTime(now() + number) AS event_time,
randomPrintableASCII(10) AS field_foo,
randomPrintableASCII(20) AS field_bar,
number
FROM numbers(1000000)
Query id: bf6ef803-5747-4ea1-ad00-a17967e349b6
Ok.
0 rows in set. Elapsed: 0.282 sec. Processed 1.00 million rows, 8.00 MB (3.55 million rows/s., 28.39 MB/s.)
verify data:
ch_minio_s3 :) SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
Query id: d8febfb0-5339-4f97-aefa-ef0003128526
┌─partition_id─┬─cityHash64(sum(number))─┬──count─┐
│ 0 │ 15460940821314360342 │ 500000 │
│ 1 │ 11827822647069388611 │ 500000 │
└──────────────┴─────────────────────────┴────────┘
2 rows in set. Elapsed: 0.025 sec. Processed 1.00 million rows, 32.00 MB (39.97 million rows/s., 1.28 GB/s.)
backup partition with id 1
to configured s3
disk:
ch_minio_s3 :) BACKUP TABLE my_table PARTITION 1 TO Disk('s3','backups/');
BACKUP TABLE my_table PARTITION 1 TO Disk('s3', 'backups/')
Query id: 810f6144-e282-42e2-99d0-9a80c75a927d
┌─id───────────────────────────────────┬─status─────────┐
│ 4d1da197-c4c9-4b6e-966c-76202eadbd53 │ BACKUP_CREATED │
└──────────────────────────────────────┴────────────────┘
1 row in set. Elapsed: 0.095 sec.
Drop the table:
ch_minio_s3 :) DROP TABLE my_table
DROP TABLE my_table
Query id: c3456044-4689-406e-82ac-8d08b8b618fe
Ok.
0 rows in set. Elapsed: 0.007 sec.
restore just partition with id 1
from backup:
ch_minio_s3 :) RESTORE TABLE my_table PARTITION 1 FROM Disk('s3','backups/');
RESTORE TABLE my_table PARTITION 1 FROM Disk('s3', 'backups/')
Query id: ea306c73-83c5-479f-9c0c-391594facc69
┌─id───────────────────────────────────┬─status───┐
│ ec6841a8-0607-465e-bc4d-d446f960d40a │ RESTORED │
└──────────────────────────────────────┴──────────┘
1 row in set. Elapsed: 0.065 sec.
validate the restored data:
ch_minio_s3 :) SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
Query id: a916176d-6a6e-47fc-ba7d-79bb33b152d8
┌─partition_id─┬─────────────────hash─┬──count─┐
│ 1 │ 11827822647069388611 │ 500000 │
└──────────────┴──────────────────────┴────────┘
1 row in set. Elapsed: 0.012 sec. Processed 500.00 thousand rows, 16.00 MB (41.00 million rows/s., 1.31 GB/s.)