
本文详解如何使用 apache poi 读取 excel(.xlsx)文件,将含逗号分隔值的单元格(如“class§ion”“subject”)智能拆分为结构化 json 对象数组,并生成符合预期的嵌套 jsonarray。
本文详解如何使用 apache poi 读取 excel(.xlsx)文件,将含逗号分隔值的单元格(如“class§ion”“subject”)智能拆分为结构化 json 对象数组,并生成符合预期的嵌套 jsonarray。
在实际教育系统或教务数据导入场景中,Excel 常以紧凑格式存储一对多关系:例如单个教师对应多个班级-年级组合(如 "6-A,7-B,8-A")或多个学科(如 "Tamil,English,Maths")。直接按列遍历会丢失语义结构,需结合单元格内容特征进行条件化解析与嵌套建模。以下提供一套健壮、可扩展的 Java 实现方案,基于 Apache POI 5.x + FastJSON(或兼容 org.json),无需引入额外 ORM 框架,全程面向对象思维组织逻辑。
✅ 核心设计思路
- 表头驱动映射:动态读取首行作为 JSON 字段名(如 "Teacher_code", "class&Section", "subject_name"),提升代码通用性;
- 类型安全解析:区分 CELL_TYPE_STRING 与 CELL_TYPE_NUMERIC,对数字型教师编码(如 23424234)避免科学计数法(如 2.3424234E7);
-
语义化拆分策略:
- 若字段含 "-"(如 "6-A")→ 视为 class-section 组合,拆为 { "class": "6", "section": "A" };
- 若无 "-"(如 "Tamil")→ 视为独立 subject,生成 { "subject": "Tamil" };
- 嵌套 JSONArray 构建:每个非空单元格解析后封装为 JSONObject 列表,再注入顶层 JSONObject 对应键下。
? 完整实现代码(Apache POI + FastJSON)
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.text.NumberFormat;
import java.util.*;
import java.util.stream.Collectors;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import com.alibaba.fastjson.JSONArray;
public class ExcelToNestedJson {
public static void main(String[] args) {
String excelPath = "C:/Users/HP/Downloads/school.xlsx";
try (FileInputStream in = new FileInputStream(excelPath);
Workbook workbook = new XSSFWorkbook(in)) {
Sheet sheet = workbook.getSheetAt(0);
if (sheet == null) throw new IllegalArgumentException("Sheet is empty");
// Step 1: Read header row
Row headerRow = sheet.getRow(sheet.getFirstRowNum());
List<String> headers = new ArrayList<>();
for (int j = headerRow.getFirstCellNum(); j < headerRow.getLastCellNum(); j++) {
Cell cell = headerRow.getCell(j);
headers.add(cell != null && cell.getCellType() == CellType.STRING
? cell.getStringCellValue().trim()
: "col_" + j);
}
// Step 2: Process data rows (skip header)
List<JSONObject> result = new ArrayList<>();
for (int i = sheet.getFirstRowNum() + 1; i <= sheet.getLastRowNum(); i++) {
Row row = sheet.getRow(i);
if (row == null) continue;
JSONObject teacherObj = new JSONObject();
for (int j = row.getFirstCellNum(); j < row.getLastCellNum(); j++) {
Cell cell = row.getCell(j);
if (cell == null) continue;
String header = headers.get(j);
JSONObject valueObj = new JSONObject();
switch (cell.getCellType()) {
case STRING:
String rawStr = cell.getStringCellValue().trim();
if (rawStr.isEmpty()) break;
// Split by comma and process each token
String[] tokens = rawStr.split("\s*,\s*");
JSONArray array = new JSONArray();
for (String token : tokens) {
token = token.trim();
if (token.isEmpty()) continue;
JSONObject item = new JSONObject();
if (token.contains("-")) {
// e.g., "6-A" → { "class": "6", "section": "A" }
String[] parts = token.split("-", 2);
item.put("class", parts[0].trim());
item.put("section", parts[1].trim());
} else {
// e.g., "Tamil" → { "subject": "Tamil" }
item.put("subject", token);
}
array.add(item);
}
teacherObj.put(header, array);
break;
case NUMERIC:
// Format number to avoid scientific notation
NumberFormat nf = NumberFormat.getInstance(Locale.ENGLISH);
nf.setGroupingUsed(false);
String numStr = nf.format(cell.getNumericCellValue()).replaceAll("[^\d.]", "");
// Assume first column is Teacher_code → store as string
if (j == row.getFirstCellNum()) {
teacherObj.put(header, numStr);
} else {
// For numeric columns other than first, treat as single class (rare, but safe fallback)
valueObj.put("class", numStr);
teacherObj.put(header, valueObj);
}
break;
default:
// Skip blank or unsupported types
break;
}
}
result.add(teacherObj);
}
// Output formatted JSON array
System.out.println(JSON.toJSONString(result, true));
} catch (IOException e) {
e.printStackTrace();
}
}
}⚠️ 关键注意事项
-
依赖配置(Maven):
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>5.2.4</version> </dependency> <dependency> <groupId>com.alibaba</groupId> <artifactId>fastjson</artifactId> <version>2.0.49</version> <!-- 推荐 v2.x,线程安全且性能更优 --> </dependency> - 空单元格鲁棒性:代码中显式检查 cell == null 和 rawStr.isEmpty(),防止 NullPointerException;
- 分隔符容错:正则 \s*,\s* 自动忽略逗号前后的空格(如 "6-A, 7-B ,8-A");
-
字段名映射灵活性:若 Excel 列名与目标 JSON 键不一致(如 "class&Section" → "class"),可在 headers 处预处理:
headers.replaceAll(s -> s.replace("class&Section", "class")); - 性能提示:对于超大 Excel(>10w 行),建议启用 SXSSFWorkbook 流式读取,并用 JSONArray 直接写入输出流,避免全量内存驻留。
✅ 验证输出示例
输入 Excel 第二行(数据行): | Teacher_code | class&Section | subject_name | |--------------|----------------|--------------------| | 23424234 | 6-A,7-B,8-A | Tamil,English,Maths |
输出 JSON:
[
{
"Teacher_code": "23424234",
"class&Section": [
{"class": "6", "section": "A"},
{"class": "7", "section": "B"},
{"class": "8", "section": "A"}
],
"subject_name": [
{"subject": "Tamil"},
{"subject": "English"},
{"subject": "Maths"}
]
}
]该方案兼顾可读性、健壮性与业务适配性,可快速集成至教务系统数据迁移、API 批量导入等生产场景。如需支持更多嵌套层级(如 subject → chapter)或导出为 Spring Boot REST 响应体,仅需扩展解析逻辑即可。










