Headers
```cpp
#include <fstream>
#include <iostream>
#include <stdexcept>  // runtime_error, invalid_argument
#include <string>
#include <vector>
using namespace std;
```
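Preliminary definitions
When splitting and merging, the files involved may be tens of gigabytes in size, which is more than the machine's memory can hold. Instead of loading a whole file at once, we read it into a fixed-size buffer piece by piece and process each piece, so memory use stays bounded no matter how large the file is.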
```cpp
// Size of the in-memory buffer used for chunked reads and writes (1 MiB).
constexpr size_t BUFFER_SIZE = 1024 * 1024;
```
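Single-threaded
File splitting
The splitting process is as follows:
- Get the size of the file.
- Compute the size of each part.
- Split the file and write each part to its target file through a binary output stream.
- Splitting is complete.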
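```cpp
// Splits the stream `input` into files.size() parts, written to the paths in `files`.
// The exception types used here are declared in <stdexcept>.
void SplitFile(fstream* input, const vector<string>& files) {
    if (files.size() <= 1) throw invalid_argument("files.size() <= 1");

    // Determine the total size by seeking to the end of the input.
    input->seekg(0, ios::end);
    const streamoff end = input->tellg();
    if (end < 0) throw runtime_error("cannot determine input size");
    const size_t size = static_cast<size_t>(end);
    input->seekg(0, ios::beg);

    // All parts have the same size; the last one also takes the remainder.
    const size_t part_size = size / files.size();
    const size_t last_size = size - part_size * (files.size() - 1);

    vector<char> buffer(BUFFER_SIZE);  // released automatically, even if we throw
    for (size_t i = 0; i < files.size(); i++) {
        fstream output(files[i], ios::out | ios::binary);
        if (!output.is_open()) throw runtime_error("output file open failed");
        size_t cur_size = (i == files.size() - 1) ? last_size : part_size;

        // Copy this part through the buffer, at most BUFFER_SIZE bytes at a time.
        while (cur_size > 0) {
            const size_t read_size = cur_size > BUFFER_SIZE ? BUFFER_SIZE : cur_size;
            if (!input->read(buffer.data(), read_size)) throw runtime_error("input read failed");
            if (!output.write(buffer.data(), read_size)) throw runtime_error("output write failed");
            cur_size -= read_size;
        }
    }  // each output stream is closed by its destructor
}
```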
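The part-size arithmetic in `SplitFile` can be checked in isolation. The sketch below assumes a hypothetical helper named `PartSizes` (not part of the original code) that returns the sizes `SplitFile` would assign to each part: every part gets `total / parts` bytes, and the last part also absorbs the remainder.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not in the original post): sizes of the pieces
// obtained when `total` bytes are split into `parts` parts using the same
// rule as SplitFile.
std::vector<std::size_t> PartSizes(std::size_t total, std::size_t parts) {
    std::vector<std::size_t> sizes;
    if (parts == 0) return sizes;  // nothing to split into
    const std::size_t part_size = total / parts;
    sizes.assign(parts, part_size);
    // The last part takes whatever the equal-sized parts did not cover.
    sizes.back() = total - part_size * (parts - 1);
    return sizes;
}
```

For a 10-byte file split into 3 parts this gives 3, 3 and 4 bytes. When the file has fewer bytes than there are parts, for example 2 bytes into 3 parts, the leading parts are empty and the last part holds everything (0, 0, 2).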
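File merging
The merging process is as follows:
- For each file to be merged, open a binary input stream with fstream.
- Read the file into the buffer piece by piece.
- Write the contents of the buffer to the output.
- Merging is complete.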
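```cpp
// Appends the contents of every file in `files`, in order, to the stream `output`.
// The exception types used here are declared in <stdexcept>.
void MergeFile(fstream* output, const vector<string>& files) {
    vector<char> buffer(BUFFER_SIZE);  // released automatically, even if we throw
    for (size_t i = 0; i < files.size(); i++) {
        fstream input(files[i], ios::in | ios::binary);
        if (!input.is_open()) throw runtime_error("input file open failed");

        // Determine the size of this part by seeking to its end.
        input.seekg(0, ios::end);
        const streamoff end = input.tellg();
        if (end < 0) throw runtime_error("cannot determine part size");
        size_t size = static_cast<size_t>(end);
        input.seekg(0, ios::beg);

        // Copy the part through the buffer, at most BUFFER_SIZE bytes at a time.
        while (size > 0) {
            const size_t read_size = size > BUFFER_SIZE ? BUFFER_SIZE : size;
            if (!input.read(buffer.data(), read_size)) throw runtime_error("input read failed");
            if (!output->write(buffer.data(), read_size)) throw runtime_error("output write failed");
            size -= read_size;
        }
    }  // each input stream is closed by its destructor
}
```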
Test code
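```cpp
int main() {
    // Open for both reading and writing: SplitFile reads from this stream,
    // and MergeFile later writes the merged parts back to it.
    fstream file("C:/Users/24761/Desktop/test/input.zip", ios::out | ios::binary | ios::in);
    if (!file.is_open()) {
        cerr << "input file open failed" << endl;
        return 1;
    }

    vector<string> files = {
        "C:/Users/24761/Desktop/test/output1.zip",
        "C:/Users/24761/Desktop/test/output2.zip",
        "C:/Users/24761/Desktop/test/output3.zip"
    };
    SplitFile(&file, files);

    // Rewind before merging. A file stream has a single position shared by
    // reads and writes, so the merged data overwrites the file from the start.
    file.clear();
    file.seekg(0, ios::beg);
    MergeFile(&file, files);
}
```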
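The test above does not check that the merged result matches the original data. One way to check this is to merge into a separate file and compare it with the input byte by byte. The sketch below assumes a hypothetical helper `FilesEqual` (not part of the original code) that compares two files through fixed-size buffers, so it also works for files larger than memory.

```cpp
#include <algorithm>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not in the original post): returns true if the files
// at `a` and `b` can both be opened and contain exactly the same bytes.
bool FilesEqual(const std::string& a, const std::string& b) {
    std::ifstream fa(a, std::ios::binary);
    std::ifstream fb(b, std::ios::binary);
    if (!fa.is_open() || !fb.is_open()) return false;

    const std::size_t chunk = 64 * 1024;  // compare 64 KiB at a time
    std::vector<char> ba(chunk), bb(chunk);
    while (true) {
        fa.read(ba.data(), static_cast<std::streamsize>(chunk));
        fb.read(bb.data(), static_cast<std::streamsize>(chunk));
        const std::streamsize na = fa.gcount();
        const std::streamsize nb = fb.gcount();
        if (na != nb) return false;  // one file ended before the other
        if (na == 0) return true;    // both ended with no difference found
        if (!std::equal(ba.begin(), ba.begin() + na, bb.begin())) return false;
    }
}
```

A round-trip test would then split the input, merge the parts into a new file, and require `FilesEqual` to return true for the input and the merged file.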