Compressing and decompressing a file generated by Java

dsjack1412 Registered member
2023-02-28 02:42

The file is compressed on Linux with the gzip command and extracted with the gunzip command.

dl04061227 Registered member
2023-02-28 02:42

Try replacing your delimiter | with commas or spaces, and then change the code to match the new format.

hnlsb2013 Registered member
2023-02-28 02:42

The exact cause cannot be determined from the description alone, but it can be investigated from several angles. First, open the file produced by Java with a tool: for example, if the generated file is a .txt file, open it with UltraEdit or Notepad and check whether the problem described is already present. If not, go on to check the file-generation step in the code. Next, replace the | symbol with another symbol, such as an ASCII comma, and inspect the file before compression to find out whether the delimiter itself is causing the problem. Then compress the file with gzip, decompress it again, and compare the contents before and after, troubleshooting step by step.
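A text editor can hide the difference between similar-looking characters, so it may help to look at the raw bytes of one record as well. The sketch below is only an illustration: `HexDump` and `toHex` are hypothetical names, and the sample record is invented. In ASCII and UTF-8, | is the byte 7C and x is the byte 78.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting the raw bytes of one record.
final class HexDump {
    // Renders each byte as two uppercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String record = "0|1123"; // invented sample record
        System.out.println(toHex(record.getBytes(StandardCharsets.UTF_8)));
        // prints: 30 7C 31 31 32 33
    }
}
```

To apply this to the real file, the bytes could be loaded with `Files.readAllBytes` before and after the gzip step and the two dumps compared.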

cwpwds Registered member
2023-02-28 02:42

If that were the cause, then in principle every | should have turned into x. But this is an isolated case.

cjc3751 Registered member
2023-02-28 02:42

The following answer is based on ChatGPT and GISer Liu:

The likely cause is that character encoding was not taken into account when the data was written, which leads to problems after compressing and decompressing.

① When writing data, convert it to a byte array and specify the character encoding explicitly, for example:

// Uses java.io.FileOutputStream, java.nio.ByteBuffer, java.nio.channels.FileChannel, java.nio.charset.StandardCharsets
String data = "0|1123|123123|45612";
byte[] bytes = data.getBytes(StandardCharsets.UTF_8);

② Then write the byte array to the file, closing the channel when done:

try (FileChannel channel = new FileOutputStream("path", true).getChannel()) {
    channel.write(ByteBuffer.wrap(bytes));
    channel.write(ByteBuffer.wrap(System.lineSeparator().getBytes(StandardCharsets.UTF_8)));
}

③ When decompressing, specify the same character encoding:

GZIPInputStream gzipInputStream = new GZIPInputStream(new FileInputStream("path.gz"));
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gzipInputStream, StandardCharsets.UTF_8));

If you have handled the data as described above and the problem persists, the compression or decompression step itself may be wrong, or something else may be going on. Please provide more information, such as the code used to compress and decompress the file and the specific data that was affected, so that we can help further.
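The three steps above can be sketched as one round trip. This is a minimal sketch under assumptions, not the original poster's code: it compresses in-process with `GZIPOutputStream` rather than calling the external gzip command, it uses a temporary file instead of the elided "path", and the names `GzipRoundTrip`, `writeGzip`, `readGzip` and `roundTrip` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical round-trip check: the same charset is named on both sides.
final class GzipRoundTrip {
    // Writes the lines as UTF-8 text into a gzip-compressed file.
    static void writeGzip(Path target, List<String> lines) {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(target)), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a gzip-compressed file back as UTF-8 text, one entry per line.
    static List<String> readGzip(Path source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(source)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Writes to a temporary .gz file, reads it back, and deletes the file.
    static List<String> roundTrip(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("records", ".gz");
            try {
                writeGzip(tmp, lines);
                return readGzip(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("0|1123|123123|45612");
        System.out.println(roundTrip(original).equals(original)); // prints: true
    }
}
```

If this check passes but the real pipeline still produces different text, the difference is more likely to come from how the file is generated or read than from gzip itself.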

durkwin Registered member
2023-02-28 02:42

If possible, change the separator from | to an ASCII comma.

cupidcool Registered member
2023-02-28 02:42

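This problem may arise because the data itself contains x characters that are being confused with the | separator, so the fields can no longer be told apart correctly. Note that | (ASCII 0x7C) and x (ASCII 0x78) are different characters, so it is worth checking which one the raw bytes actually contain.
Also check whether the raw data contains spaces or characters such as \n.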

gyh33900 Registered member
2023-02-28 02:42

This may be because the encodings used when writing and when decompressing are inconsistent. In Java, strings are represented internally as UTF-16, but a different encoding, such as UTF-8 or GBK, may be used when reading and writing files.

When you convert a string to bytes with UTF-8, the separator "|" is written to the file as UTF-8 bytes. If a different encoding (for example, GBK) is used after decompression, characters may be decoded incorrectly, and "|" may show up as another character, such as "x".

To solve this problem, always use the same encoding when writing files, reading files, and processing the decompressed output. UTF-8 is recommended because it can represent all Unicode characters and is widely supported across operating systems. When reading the decompressed data, specify the encoding that was used when the file was written so that every character is decoded correctly.
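The effect of a mismatch can be tried out in memory. This is a sketch under assumptions: `CharsetMismatchDemo` and `reDecode` are hypothetical names, the sample text is invented, and ISO-8859-1 stands in for GBK only because it is guaranteed to exist on every JVM, so the outcome with GBK may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: encode with one charset, decode with another.
final class CharsetMismatchDemo {
    // Encodes the text with writeCharset and decodes the bytes with readCharset.
    static String reDecode(String text, Charset writeCharset, Charset readCharset) {
        return new String(text.getBytes(writeCharset), readCharset);
    }

    public static void main(String[] args) {
        // Invented sample: one non-ASCII character (U+4E2D), the delimiter, then digits.
        String original = "\u4E2D|1123";

        String same = reDecode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = reDecode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(same.equals(original));   // prints: true
        System.out.println(mixed.equals(original));  // prints: false
        System.out.println(mixed.indexOf('|') >= 0); // prints: true
    }
}
```

In this particular pairing the non-ASCII character is damaged while the ASCII "|" survives, which suggests testing the charsets actually in use rather than assuming the delimiter is the part that breaks.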

dongmaolin Registered member
2023-02-28 02:42





1. Check whether the data contains a similar-looking symbol such as x, or switch to an alternative delimiter such as ~, and see whether the problem persists.
Compression, here's what you need to do:
1.

ahsatx Registered member
2023-02-28 02:42

Based on GPT and my own ideas: this may be an encoding problem. In Java, when text is written to a file without an explicit charset, the platform's default character encoding is typically used, and the file may later be read with a different encoding. Therefore, when you gzip and gunzip text files, specify the character encoding explicitly so that the same encoding is used for both writing and reading.

In addition, if the field values themselves contain special characters such as "|", they will conflict with your delimiter and break the data format. If you want to keep these characters, escape them or handle them in some other way when writing to the file.
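One possible way to escape the delimiter is sketched below. Backslash escaping is an assumed convention chosen for illustration, not something the original post prescribes, and `PipeEscaping`, `escape`, `join` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: backslash-escapes '|' inside field values.
final class PipeEscaping {
    // Escapes backslashes first, then the delimiter, so the result is unambiguous.
    static String escape(String field) {
        return field.replace("\\", "\\\\").replace("|", "\\|");
    }

    // Joins escaped fields with '|' as the delimiter.
    static String join(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(escape(fields.get(i)));
        }
        return sb.toString();
    }

    // Splits on unescaped '|' and removes the escaping backslashes.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                current.append(line.charAt(++i)); // take the escaped character literally
            } else if (c == '|') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("0", "a|b", "45612"); // invented sample
        String line = join(fields);
        System.out.println(line);                       // prints: 0|a\|b|45612
        System.out.println(split(line).equals(fields)); // prints: true
    }
}
```

Any reader of the file has to apply the same convention, so this only helps when both the writer and the consumer can be changed.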

Finally, you can diagnose the problem by printing the contents of the file before and after compression, making sure the correct encoding is used when reading the decompressed data.
If this helps, please accept the answer. Thank you.
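The before-and-after comparison can also be done on raw bytes, which takes character encoding out of the picture. The sketch below is an assumption-laden illustration: it compresses in memory with `GZIPOutputStream` instead of the gzip command, and `BeforeAfterCheck`, `gzip`, `gunzip` and `firstDifference` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical byte-level check of the data before and after gzip.
final class BeforeAfterCheck {
    // Compresses the bytes in memory.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // read only after the gzip stream is closed
    }

    // Decompresses the bytes in memory.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Returns the index of the first differing byte, or -1 if the arrays are identical.
    static int firstDifference(byte[] a, byte[] b) {
        int shared = Math.min(a.length, b.length);
        for (int i = 0; i < shared; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : shared;
    }

    public static void main(String[] args) {
        byte[] before = "0|1123|123123|45612\n".getBytes(StandardCharsets.UTF_8);
        byte[] after = gunzip(gzip(before));
        System.out.println(firstDifference(before, after)); // prints: -1
    }
}
```

With real files, the two arrays could come from `Files.readAllBytes` on the original and on the gunzip output; a non-negative index then points at the first byte worth inspecting.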