Microservice Architecture | How to Solve Chunked Upload of Large Attachments?
2022-06-23 21:11:00 [Code Farmer Architecture]
Foreword: Chunked upload and resumable upload should be familiar terms to anyone who has worked on or around file uploading. This article summarizes both, and I hope it helps or inspires readers doing related work.
When a file is very large, uploading it takes a long time over a single long-lived connection. What if the network fluctuates, or drops entirely, partway through? Any instability during such a long transfer means the whole upload fails and has to start over from scratch.

Chunked upload splits the file into blocks of a certain size (each called a Part) and uploads them separately; after all parts are uploaded, the server assembles them back into the original file. Chunked upload not only avoids having to restart from the beginning of the file when the network is poor, it also allows different blocks to be sent concurrently across multiple threads, improving transfer efficiency and reducing total send time.
1. Background
After a sudden surge in system users, and to better serve the customized needs of different user groups, the business gradually added support for C-end (consumer-facing) user-defined layouts and configuration, which caused read IO on configuration data to spike.

To optimize this scenario, user-defined configuration is managed statically: the generated configuration becomes a static configuration file. But generating these static files raised a thorny problem: when a configuration file is too large, it waits a long time in the file upload server, dragging down the overall performance of the whole business scenario.
2. Generating the Configuration File
Three elements of a generated file:

- File name
- File content
- File storage format
File content and storage format are straightforward to understand and handle. Separately, I have written up the encryption methods commonly used in microservices:

- Microservice Architecture | Common encryption methods for microservices (Part 1)
- Microservice Architecture | Common encryption methods for data encryption (Part 2)

As a side note, these are worth considering if the file content needs to be encrypted. In this article's scenario, however, the configuration information has low confidentiality requirements, so encryption is not expanded on here.
The naming convention for file names is decided by the business scenario, usually profile name + timestamp. But such a convention easily produces file name conflicts, causing unnecessary trouble later.

So file naming gets special treatment here. Readers with front-end Route handling experience may recognize the idea: generate the file name from a hash of the file content instead.
Spring 3.0 and later provides a method for computing digests: DigestUtils#md5DigestAsHex, which returns the hexadecimal string representation of the MD5 digest of the given bytes.

md5DigestAsHex source:
```java
/**
 * Calculate the MD5 digest of the given bytes.
 * @param bytes the bytes to digest
 * @return the hexadecimal string representation of the MD5 digest of the given bytes
 */
public static String md5DigestAsHex(byte[] bytes) {
    return digestAsHexString(MD5_ALGORITHM_NAME, bytes);
}
```

Once the file name, content, and suffix (storage format) are determined, the file can be generated directly.
```java
/**
 * Generate a file directly from content.
 */
public static void generateFile(String destDirPath, String fileName, String content) throws FileZipException {
    File targetFile = new File(destDirPath + File.separator + fileName);
    // Ensure that the parent directory exists
    if (!targetFile.getParentFile().exists() && !targetFile.getParentFile().mkdirs()) {
        throw new FileZipException("path is not found");
    }
    // Write with the configured file encoding
    try (PrintWriter writer = new PrintWriter(new BufferedWriter(
            new OutputStreamWriter(new FileOutputStream(targetFile), ENCODING)))) {
        writer.write(content);
    } catch (Exception e) {
        throw new FileZipException("create file error", e);
    }
}
```

The advantage of generating files from content is self-evident: it greatly reduces unnecessary regeneration based on content comparison. If the content-derived file name is unchanged, the content has not changed, so no subsequent file update is needed.
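A quick usage sketch of the same idea (FileZipException and ENCODING above come from the article's codebase; this standalone version uses java.nio and UTF-8 instead so it runs on its own):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class GenerateFileDemo {
    /** Write content to destDir/fileName, creating parent directories as needed. */
    static Path generateFile(Path destDir, String fileName, String content) throws IOException {
        Files.createDirectories(destDir); // ensure the parent directory exists
        Path target = destDir.resolve(fileName);
        Files.writeString(target, content, StandardCharsets.UTF_8);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("config-demo");
        Path f = generateFile(dir, "a1b2c3.json", "{\"layout\":\"grid\"}");
        System.out.println(Files.readString(f)); // {"layout":"grid"}
    }
}
```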
3. Chunked Upload of Attachments
As introduced above, chunked upload splits the file into blocks of a certain size (Parts) that are uploaded separately; once all parts arrive, the server assembles them back into the original file. It avoids re-uploading from the start of the file on a poor network, and allows different blocks to be sent concurrently across multiple threads to improve transfer efficiency and reduce total send time.
Chunked upload mainly suits the following scenarios:

- Poor network environment: when an upload fails, only the failed Part needs to be retried; the other Parts do not have to be re-uploaded.
- Resumable upload: after a pause, uploading can continue from the last completed Part.
- Faster uploads: when the local file to upload to OSS is large, multiple Parts can be uploaded in parallel to speed things up.
- Streaming upload: uploading can start before the total file size is known, which is common in video surveillance and similar industry applications.
- Large files: for relatively large files, chunked upload is generally the default choice.
The overall chunked upload process is roughly:

- Split the file to upload into equal-size blocks according to some splitting rule;
- Initialize a chunked upload task and get back a unique identifier for this upload;
- Send each data block according to some strategy (serial or parallel);
- After sending completes, the server checks whether all data has been uploaded; if so, it assembles the blocks into the original file.
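The split-and-reassemble steps above can be sketched end to end in miniature: split a byte array into fixed-size parts, then let a "server" concatenate them back. This is a toy model; a real service would persist each Part keyed by the upload id and part number:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitMergeDemo {
    /** Step 1: split the file into blocks of at most partSize bytes. */
    static List<byte[]> split(byte[] file, int partSize) {
        List<byte[]> parts = new ArrayList<>();
        for (int off = 0; off < file.length; off += partSize) {
            parts.add(Arrays.copyOfRange(file, off, Math.min(off + partSize, file.length)));
        }
        return parts;
    }

    /** Step 4: the server concatenates the received Parts back into the original file. */
    static byte[] merge(List<byte[]> parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) out.writeBytes(p);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] file = "large attachment content".getBytes(); // 24 bytes
        List<byte[]> parts = split(file, 5);
        System.out.println(parts.size());                       // 5 Parts (5+5+5+5+4 bytes)
        System.out.println(Arrays.equals(file, merge(parts)));  // true
    }
}
```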
▐ Define the chunk size rule

By default, files are forcibly split once they reach 20 MB:

```java
/**
 * Forced chunk size (20 MB)
 */
long FORCE_SLICE_FILE_SIZE = 20L * 1024 * 1024;
```

For easier debugging, the forced-chunk threshold in the example below is lowered to 1 MB (1024 KB).
▐ Define the chunk upload object

As shown in the figure above (file chunks numbered in red), the basic attributes of the chunk upload object include: attachment file name, original file size, MD5 of the original file, total number of chunks, size of each chunk, current chunk size, current chunk index, and so on.

These attributes make it easy to split the file sensibly and to extend the business later (for example, chunk merging); you can of course add more attributes for your own scenario.
- Total number of chunks:

```java
long totalSlices = fileSize % forceSliceSize == 0 ?
        fileSize / forceSliceSize : fileSize / forceSliceSize + 1;
```

- Size of each chunk:

```java
long eachSize = fileSize % totalSlices == 0 ?
        fileSize / totalSlices : fileSize / totalSlices + 1;
```

- MD5 of the original file:

```java
MD5Util.hex(file)
```
For example: the current attachment is 3382 KB and the forced chunk limit is 1024 KB. By the calculation above, there are 4 chunks, each 846 KB.
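Plugging the example numbers into the two formulas confirms the result (sizes in KB for readability):

```java
public class SliceMath {
    /** Round up: number of chunks needed at the forced chunk size. */
    static long totalSlices(long fileSize, long forceSliceSize) {
        return fileSize % forceSliceSize == 0
                ? fileSize / forceSliceSize : fileSize / forceSliceSize + 1;
    }

    /** Round up: size of each chunk once the chunk count is fixed. */
    static long eachSize(long fileSize, long totalSlices) {
        return fileSize % totalSlices == 0
                ? fileSize / totalSlices : fileSize / totalSlices + 1;
    }

    public static void main(String[] args) {
        long total = totalSlices(3382, 1024); // attachment 3382 KB, threshold 1024 KB
        long each = eachSize(3382, total);
        System.out.println(total + " chunks of " + each + " KB"); // 4 chunks of 846 KB
        // 4 * 846 = 3384 KB >= 3382 KB, so the last chunk is slightly smaller.
    }
}
```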
▐ Read the data bytes of each chunk

Track the current byte offset and read the 4 chunks' data bytes in a loop.
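The loop below relies on a readSliceBytes helper whose implementation the article does not show. As a rough, hypothetical sketch (the signature is simplified here; the article's version takes a SliceBytesVO and fills it in place), such a helper reads exactly one chunk's bytes and advances the stream:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class SliceReader {
    /** Read the bytes of chunk i; the last chunk may be shorter than eachSize. */
    static byte[] readSliceBytes(InputStream in, long fileSize, long eachSize, int i) throws IOException {
        long remaining = fileSize - (long) i * eachSize;
        int toRead = (int) Math.min(eachSize, remaining);
        return in.readNBytes(toRead); // advances the stream by exactly one chunk
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        for (int k = 0; k < 10; k++) data[k] = (byte) k;
        InputStream in = new ByteArrayInputStream(data);
        // 10 bytes with eachSize = 4 -> chunks of 4, 4, 2 bytes
        System.out.println(Arrays.toString(readSliceBytes(in, 10, 4, 0))); // [0, 1, 2, 3]
        System.out.println(Arrays.toString(readSliceBytes(in, 10, 4, 1))); // [4, 5, 6, 7]
        System.out.println(Arrays.toString(readSliceBytes(in, 10, 4, 2))); // [8, 9]
    }
}
```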
```java
try (InputStream inputStream = new FileInputStream(uploadVO.getFile())) {
    for (int i = 0; i < sliceBytesVO.getFdTotalSlices(); i++) {
        // Read the data bytes of the current chunk
        this.readSliceBytes(i, inputStream, sliceBytesVO);
        // Call the chunk-upload API function
        String result = sliceApiCallFunction.apply(sliceBytesVO);
        if (StringUtils.isEmpty(result)) {
            continue;
        }
        return result;
    }
} catch (IOException e) {
    throw e;
}
```

4. Summary
The so-called chunked upload splits the file into blocks of a certain size (Parts) and uploads them separately.

When handling large files with chunking, the core is to determine three things:

- the chunk granularity
- how to read the chunks
- how to store the chunks

This article focused on comparing file contents and splitting the file during large-file upload: setting a reasonable chunk threshold, and reading and indexing the chunks. I hope it helps or inspires readers doing related work. A follow-up article will cover storing, tracking, and merging the chunks in detail.