Java: escaping delimiter characters in data when writing to a file

2023-02-28 18:40

This answer draws on GPT and my own ideas. When the data itself contains the delimiter, use quoting or escaping to distinguish commas in the data from delimiter commas. By convention, CSV files enclose a field that contains the delimiter in double quotes, and commas inside the double quotes are not treated as delimiters, such as:

1,2,"3,4",5

In the data above, fields are separated by commas, while the comma in "3,4" is enclosed in double quotes and is not treated as a delimiter.

In a Java string literal, the double quotes that enclose such a field must themselves be escaped with a backslash, such as:

String data = "1,2,\"3,4\",5";

If the data itself contains a double quote, write it as two double quotes inside the quoted field. For example, here the third field is 3,"4:

String data = "1,2,\"3,\"\"4\",5";

When writing a CSV file, use commas as delimiters, enclose fields in double quotes, and write each double quote inside a field as two double quotes, for example:

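// Requires: import java.util.Arrays;
String[] row = {"1", "2", "3,4", "5"};
// Wrap every field in double quotes, doubling any double quote inside it, then join with commas
String line = String.join(",", Arrays.stream(row).map(s -> "\"" + s.replace("\"", "\"\"") + "\"").toArray(String[]::new));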

The code above wraps every element of the array row in double quotes (not only the ones containing commas), doubles any double quote inside an element, and joins the elements with commas. The resulting string is:

"1","2","3,4","5"

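As a variation on the approach above, here is a minimal sketch that quotes a field only when it actually needs it (it contains a comma, a double quote, or a line break). The class and method names (CsvQuoting, quoteIfNeeded, toCsvLine) are made up for illustration and do not come from any library.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class CsvQuoting {
    // Quote a field only if it contains a comma, a double quote, or a line break.
    static String quoteIfNeeded(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Double every embedded double quote, then wrap the field in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the (possibly quoted) fields with commas.
    static String toCsvLine(String[] row) {
        return Arrays.stream(row)
                .map(CsvQuoting::quoteIfNeeded)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(toCsvLine(new String[] {"1", "2", "3,4", "5"}));  // 1,2,"3,4",5
        System.out.println(toCsvLine(new String[] {"say \"hi\"", "x"}));     // "say ""hi""",x
    }
}
```

Quoting every field and quoting only where needed are both read correctly by a parser that follows the usual CSV quoting convention; the second form just produces smaller files.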

2023-02-28 18:40

Hello, you can do either of the following:
1. Do not use a comma as the separator when writing to the file.
2. When writing to the file, put single or double quotation marks around each character-type field.
For example, if you have a record:

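int id = 20;
String message = "Hello, not bad today, huh?";
String name = "Xiao Liu";

If you keep the comma as the separator when writing to the file, wrap the string field in double quotes first:

int id = 20;
String message = "Hello, not bad today, huh?";
String name = "Xiao Liu";
// Wrap the message in double quotes so its commas are not read as separators
message = "\"" + message + "\"";

The line written to the file is then:

20,"Hello, not bad today, huh?",Xiao Liu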
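To make the write step concrete, here is a rough sketch that builds the line for one record and writes it to a file. It assumes every string field is quoted (so the name is quoted too) and that embedded double quotes are doubled. RecordWriter, quote, toLine, and the temporary file are hypothetical choices for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

class RecordWriter {
    // Wrap a string field in double quotes, doubling any embedded double quote.
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    // Build one line: the numeric id is written as-is, string fields are quoted.
    static String toLine(int id, String message, String name) {
        return id + "," + quote(message) + "," + quote(name);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical output location; a real program would use its own path.
        Path out = Files.createTempFile("records", ".csv");
        String line = toLine(20, "Hello, not bad today, huh?", "Xiao Liu");
        Files.write(out, Collections.singletonList(line), StandardCharsets.UTF_8);
        // Read the file back to show what was written:
        // 20,"Hello, not bad today, huh?","Xiao Liu"
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8).get(0));
    }
}
```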

2023-02-28 18:40

When writing database contents to a file, delimiters are not the only way to separate fields. If you do use them, you need a rule that specifies which commas count as delimiters. With such a rule in place, you can get the result you want.

2023-02-28 18:40

1- Changing the delimiter has already been explained in the other answers.
2- If you need to manipulate the data in this field frequently, you can split the field out into its own table, depending on which is more convenient for the business.
3- If neither of the above can be done, the distinction can only be made from characteristics of the data, such as length. P.S. If the data has no distinguishing feature and a person cannot tell the commas apart, code cannot be written to tell them apart either.

2023-02-28 18:40

This answer references GPT (OpenAI).
When writing data to a file, if the data contains the delimiter, consider using an escape character or quotation marks to mark the commas in the data, to avoid confusing them with delimiter commas.

Here are some possible solutions:

  1. Use escape characters: add an escape character, such as a backslash "\", before each comma in the data, so that the reader recognizes it as a comma in the data and not a separator.
  2. Use quotes: enclose data containing commas in a pair of quotes, such as double quotes ("...") or single quotes ('...'), so that the reader can likewise tell that the comma belongs to the data and is not a separator.

Note that if you use quotes to mark commas in data, you need to handle quotes that appear in the data itself, either by using a second quote as the escape character or by using some other agreed sequence inside the quoted field to represent the quote. For example, if you use double quotes to enclose fields containing commas, each double quote that occurs in the data must be written as two double quotes inside the quoted field.
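The reading side of the quoting rule can be sketched as a small state machine. This is a minimal illustration that handles a single line only (no line breaks inside quoted fields), and CsvLineParser and parse are hypothetical names, not a library API.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLineParser {
    // Split one CSV line into fields, honoring double-quoted fields
    // and treating "" inside a quoted field as one literal double quote.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote: one literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas here belong to the data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // delimiter: end of field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("1,2,\"3,4\",5"));      // [1, 2, 3,4, 5]
        System.out.println(parse("1,2,\"3,\"\"4\",5"));  // [1, 2, 3,"4, 5]
    }
}
```

For production data, an established CSV library is usually a safer choice than a hand-written parser, since it also covers multi-line fields and malformed input.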

2023-02-28 18:40

Place double quotes around each data cell to distinguish commas in the data from delimiter commas. Add the double quotes when the CSV file is generated, and specify the quotechar parameter as a double quote when reading the file. If you use pandas to read the CSV file, note the utf-8-sig encoding, which handles a leading byte-order mark.
Suppose the data looks like this:


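import pandas as pd

# Build the data and write the CSV file; by default to_csv quotes fields that contain the delimiter
df = pd.DataFrame({"name":["张三","李四","王五"],"age":[18,20,22],"hobby":["看书,听音乐","打游戏","写代码"]})
df.to_csv("data.csv", index=False, encoding="utf-8-sig")

# Read the CSV file back
df = pd.read_csv("data.csv", quotechar='"', encoding="utf-8-sig")
print(df)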

  name  age    hobby
0   张三   18  看书,听音乐
1   李四   20     打游戏
2   王五   22     写代码

2023-02-28 18:40


I suggest handling it this way:

String[] strings = CsvUtil.toStringArray(csvStr);


2023-02-28 18:40

This answer quotes ChatGPT

When writing data from a database to a file, if the commas in the data are the same as the delimiter you have chosen, you need to escape the data appropriately or use a different delimiter to avoid confusion.

One solution is to mark the data so that its commas are not read as delimiters. In CSV files, double quotes are usually used for this: if a field contains commas, enclose the field in double quotes. For example, the value apple,orange,banana should be written as "apple,orange,banana". When reading the file, parse it by the same rule and strip the enclosing quotes.

Another solution is to use a different separator, such as a TAB or vertical bar. Doing so avoids confusion with commas in the data. When reading a file, you can specify the correct delimiter to parse the data.
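A minimal sketch of the alternative-separator approach, using a tab. It assumes the data never contains a tab or a line break and rejects a field that does, because otherwise the same ambiguity comes back. TabDelimited, join, and split are names invented for this example.

```java
class TabDelimited {
    // Join fields with a tab; refuse fields that would break the format.
    static String join(String[] row) {
        for (String field : row) {
            if (field.indexOf('\t') >= 0 || field.indexOf('\n') >= 0) {
                throw new IllegalArgumentException(
                        "field contains a tab or line break: " + field);
            }
        }
        return String.join("\t", row);
    }

    // Split on tabs; the limit -1 keeps trailing empty fields.
    static String[] split(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = join(new String[] {"1", "apple,orange,banana", ""});
        // Commas in the data need no special handling here.
        System.out.println(split(line).length); // 3
    }
}
```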

2023-02-28 18:40

If the data in the database inherently contains commas, can't you switch to a different symbol as the separator, such as ||? In addition, \t, /, &, or a character such as → can also be used as the separator.
Otherwise you must escape the commas in the original data.
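If the comma has to stay as the separator, escaping the commas in the data could look like the sketch below. It assumes a backslash as the escape character (one possible convention, not part of standard CSV), so the backslash itself must be escaped as well. BackslashEscaping and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class BackslashEscaping {
    // Escape backslashes first, then commas, so the result decodes unambiguously.
    static String escape(String field) {
        return field.replace("\\", "\\\\").replace(",", "\\,");
    }

    // Escape each field and join the fields with plain commas.
    static String join(String[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(row[i]));
        }
        return sb.toString();
    }

    // Split on unescaped commas and remove the escape characters.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                current.append(line.charAt(i + 1)); // take the escaped character literally
                i++;
            } else if (c == ',') {
                fields.add(current.toString());     // unescaped comma: separator
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = join(new String[] {"a,b", "c\\d", "e"});
        System.out.println(line);        // a\,b,c\\d,e
        System.out.println(split(line)); // [a,b, c\d, e]
    }
}
```

Files written this way are not standard CSV, so the reader must use the same convention; tools such as spreadsheets will generally not understand it.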

2023-02-28 18:40

Then use two commas as the separator.

