This article covers methods and tips for compiling multiple papers into one document in Word 2007, that is, how to merge many Word documents. It also collects related material on parsing Word 2007 and Excel 2007 with Java, exporting Word 2007 (docx) documents with PHPWord without garbled Chinese text, converting Excel 2003, Excel 2007, Word 2003 and Word 2007 files to HTML with POI, and filling Word templates (Word 2003, Word 2007) with POI.
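Contents:
- Methods and tips for compiling multiple papers in Word 2007 (how to merge multiple Word documents)
- Parsing Word 2007 and Excel 2007 with Java
- PHPWord: fixing garbled Chinese text and exporting Word 2007 (docx) documents
- POI: converting Excel 2003, Excel 2007, Word 2003 and Word 2007 to HTML
- POI: working with Word templates (Word 2003, Word 2007)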
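Methods and tips for compiling multiple papers in Word 2007 (how to merge multiple Word documents)
Every year the school collects the papers written by all its teachers into an annual compilation. It is not light work: more than a hundred paper documents have to be merged into one, and the result then needs a page break before each paper, a table of contents, and page numbers. Done by hand, this takes a great deal of time and effort. In Word 2007, however, a few common features are enough to finish the job quickly and easily.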
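Merging the documents
Merging a hundred or so papers in Word 2007 is fairly simple. First put all the submitted documents in one folder, for example "d:\论文集". Open Word 2007, create a new document, switch to the "Insert" tab, click "Object" in the "Text" group, and choose "Text from File" from the drop-down list. In the "Insert File" window, open the folder "d:\论文集", press Ctrl+A to select all the papers, and click "Insert". All the papers are merged into the current document at once.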
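Paging each paper quickly
After merging, the papers run straight into one another, whereas a compilation normally requires every title to start on a new page. In other words, a page break is needed before each title. Pressing Ctrl+Enter a hundred times would take quite a while. Fortunately the school has a uniform format requirement: every submitted title is set in size 二号 (22 pt) and bold. That lets us insert the page breaks with Word's find and replace.
In Word 2007, click "Replace" in the "Editing" group of the "Home" tab, then click "More" in the "Find and Replace" window to show the advanced options. Click "Format" and choose "Font…". In the "Find Font" dialog select the size "二号" and the font style "Bold". After confirming, the line under "Find what" shows "Format: Font: 二号, Bold". Leave "Find what" empty and type only ^m^& in "Replace with" (Figure 1). Click "Replace All" to insert a page break before every title, which pages the document by paper automatically. For a paper whose title spans two lines, Word inserts the break only before the first line.
Note: ^m stands for a manual page break and ^& stands for the text that was found. The search format is not cleared until Word is restarted. Before searching for anything else, click in the "Find what" box and then click "No Formatting" to remove the search format; otherwise the search may not find what you are looking for.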
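Building the table of contents quickly
Typing the table of contents by hand is tedious, and so is correcting it later when the page numbers change. Instead, apply a heading style to all the titles and let Word generate the table of contents from that style.
Click "Find" in the "Editing" group of the "Home" tab, click "More", then click "Format" and choose "Font…" to set the search font to "二号" and "Bold". Leave "Find what" empty, click "Find in", and choose "Main Document". All paper titles in the document are now selected. Click "Title" in the "Styles" gallery of the "Home" tab to apply the "Title" style to every paper title.
Now move to the start of the document, switch to the "References" tab, click "Table of Contents", and choose "Insert Table of Contents". In the "Table of Contents" window set "Show levels" to 1, then click "Options". In the style list of the "Table of Contents Options" window, delete the level number 1 after "Heading 1" so that only "Title" is kept (Figure 2). Confirm all dialogs, and the table of contents is inserted at the start of the document. Then switch to the "Page Layout" tab, click "Breaks", and choose the "Next Page" section break to separate the table of contents from the papers. You will probably also want to type a "Contents" heading above it. If later layout changes alter titles or page numbers, right-click the table of contents and choose "Update Field" to correct it automatically.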
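Setting the page numbers
Setting the page numbers is fairly simple. Switch to the "Insert" tab, click "Page Number", and choose "Plain Number 2" under "Bottom of Page" to add page numbers to the document. Next, place the insertion point in front of the first paper, click "Page Number" again, and choose "Format Page Numbers". In the dialog set "Start at" to 1 and confirm, so that the numbering of the papers starts from 1. Finally, right-click the table of contents and choose "Update Field" to show the new page numbers. That completes the compilation.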
Parsing Word 2007 and Excel 2007 with Java

package com.test;
/**
* Required JARs:
* poi-3.0.2-FINAL-20080204.jar
* poi-contrib-3.0.2-FINAL-20080204.jar
* poi-scratchpad-3.0.2-FINAL-20080204.jar
* poi-3.5-beta6-20090622.jar
* geronimo-stax-api_1.0_spec-1.0.jar
* ooxml-schemas-1.0.jar
* openxml4j-bin-beta.jar
* poi-ooxml-3.5-beta6-20090622.jar
* xmlbeans-2.3.0.jar
* dom4j-1.6.1.jar
*/
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.poi.POIXMLDocument;
import org.apache.poi.POIXMLTextExtractor;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.xmlbeans.XmlException;
public class WordAndExcelExtractor {
public static void main(String[] args){
try{
String wordFile = "D:/松山血战.docx";
String wordText2007 = WordAndExcelExtractor.extractTextFromDOC2007(wordFile);
System.out.println("wordText2007======="+wordText2007);
InputStream is = new FileInputStream("D:/XXX研发中心技术岗位职位需求.xls");
String excelText = WordAndExcelExtractor.extractTextFromXLS(is);
System.out.println("text2003==========" + excelText);
String excelFile = "D:/Hello2007.xlsx";
String excelText2007 = WordAndExcelExtractor.extractTextFromXLS2007(excelFile);
System.out.println("excelText2007==========" + excelText2007);
}catch(Exception e ){
e.printStackTrace();
}
}
/**
* @Method: extractTextFromDOC
* @Description: Extracts plain text from a Word 2003 document
*
* @param
* @return String
* @throws
*/
public static String extractTextFromDOC(InputStream is) throws IOException {
WordExtractor ex = new WordExtractor(is); // is: InputStream of the Word file
return ex.getText();
}
/**
* @Method: extractTextFromDOC2007
* @Description: Extracts plain text from a Word 2007 document
*
* @param
* @return String
* @throws
*/
public static String extractTextFromDOC2007(String fileName) throws IOException, OpenXML4JException, XmlException {
OPCPackage opcPackage = POIXMLDocument.openPackage(fileName);
POIXMLTextExtractor ex = new XWPFWordExtractor(opcPackage);
return ex.getText();
}
/**
* @Method: extractTextFromXLS
* @Description: Extracts plain text from an Excel 2003 document
*
* @param
* @return String
* @throws
*/
@SuppressWarnings("deprecation")
private static String extractTextFromXLS(InputStream is)
throws IOException {
StringBuffer content = new StringBuffer();
HSSFWorkbook workbook = new HSSFWorkbook(is); // open the Excel workbook
for (int numSheets = 0; numSheets < workbook.getNumberOfSheets(); numSheets++) {
if (null != workbook.getSheetAt(numSheets)) {
HSSFSheet aSheet = workbook.getSheetAt(numSheets); // get one sheet
for (int rowNumOfSheet = 0; rowNumOfSheet <= aSheet.getLastRowNum(); rowNumOfSheet++) {
if (null != aSheet.getRow(rowNumOfSheet)) {
HSSFRow aRow = aSheet.getRow(rowNumOfSheet); // get one row
for (short cellNumOfRow = 0; cellNumOfRow <= aRow.getLastCellNum(); cellNumOfRow++) {
if (null != aRow.getCell(cellNumOfRow)) {
HSSFCell aCell = aRow.getCell(cellNumOfRow); // get the cell value
if(aCell.getCellType() == HSSFCell.CELL_TYPE_NUMERIC){
content.append(aCell.getNumericCellValue());
}else if(aCell.getCellType() == HSSFCell.CELL_TYPE_BOOLEAN){
content.append(aCell.getBooleanCellValue());
}else {
content.append(aCell.getStringCellValue());
}
}
}
}
}
}
}
return content.toString();
}
/**
* @Method: extractTextFromXLS2007
* @Description: Extracts plain text from an Excel 2007 document
*
* @param
* @return String
* @throws
*/
private static String extractTextFromXLS2007(String fileName) throws Exception{
StringBuffer content = new StringBuffer();
// Build the XSSFWorkbook from the file path passed in as fileName
XSSFWorkbook xwb = new XSSFWorkbook(fileName);
// Loop over the sheets
for(int numSheet = 0; numSheet < xwb.getNumberOfSheets(); numSheet++){
XSSFSheet xSheet = xwb.getSheetAt(numSheet);
if(xSheet == null){
continue;
}
// Loop over the rows
for(int rowNum = 0; rowNum <= xSheet.getLastRowNum(); rowNum++){
XSSFRow xRow = xSheet.getRow(rowNum);
if(xRow == null){
continue;
}
// Loop over the cells
for(int cellNum = 0; cellNum <= xRow.getLastCellNum(); cellNum++){
XSSFCell xCell = xRow.getCell(cellNum);
if(xCell == null){
continue;
}
if(xCell.getCellType() == XSSFCell.CELL_TYPE_BOOLEAN){
content.append(xCell.getBooleanCellValue());
}else if(xCell.getCellType() == XSSFCell.CELL_TYPE_NUMERIC){
content.append(xCell.getNumericCellValue());
}else{
content.append(xCell.getStringCellValue());
}
}
}
}
return content.toString();
}
}
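As a complement to the POI extractors above, the following is a minimal sketch that uses only the JDK. It relies on the fact that a Word 2007 .docx file is a ZIP package whose main body is normally stored in the part word/document.xml, with paragraphs in w:p elements and text in w:t elements. The class name DocxZipTextSketch and its method names are hypothetical and are not part of POI or of the reposted code. The sketch ignores headers, footers, footnotes, tabs and line breaks, so treat it as an illustration of the file format rather than a replacement for XWPFWordExtractor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DocxZipTextSketch {
    // Namespace of the w:p (paragraph) and w:t (text) elements in WordprocessingML.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of word/document.xml, one line per paragraph. */
    public static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(readEntry(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found in package");
    }

    // Copies the current ZIP entry into memory so the XML parser cannot close the ZIP stream.
    private static byte[] readEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS
        Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        StringBuilder text = new StringBuilder();
        NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // Concatenate all text runs of this paragraph, then end the line.
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            text.append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java DocxZipTextSketch <file.docx>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.print(extractText(in));
        }
    }
}
```

Run it with the path of a .docx file as the only argument. Because it needs no extra JARs, it can be handy for a quick check of what a document contains before wiring up the full POI dependency list.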
Reposted from: http://archive.cnblogs.com/a/1759383/
PHPWord: fixing garbled Chinese text and exporting Word 2007 (docx) documents
A recent project needed to export Word documents with PHP. I compared several approaches; the first was to use Microsoft Office's own […]
Original article: PHPWord: fixing garbled Chinese text and exporting Word 2007 (docx) documents. Thanks to the original author for sharing.
POI: converting Excel 2003, Excel 2007, Word 2003 and Word 2007 to HTML
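The previous post covered parsing PPT; this one is about Excel and Word. POI is actually very handy for parsing Excel. I drew on what others have posted online and added and changed a few things myself. We are all fellow coders, so I am sharing it for reference, and corrections are welcome. One problem I ran into: returning the generated HTML to the page through JSON did not work, because as far as I could tell JSON does not support HTML output. A compromise is to encode with encodeURI and decode with decodeURI, but decoding the whole document has problems, since some characters such as "=" and ";" are not handled completely. So I do not really recommend JSON here. If you must use it, decode manually (which is as painful as it sounds), and the HTML will still have flaws.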
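On the JSON problem described above: a JSON string can in general carry HTML as long as quotes, backslashes and control characters are escaped, so one possible cause of the trouble is HTML being concatenated into the response without escaping. The following is a minimal sketch of that escaping using only the JDK. The class name JsonHtmlEscapeSketch and the method toJsonString are hypothetical, and whether this resolves the author's specific case is an assumption; in a real project a JSON library would normally do this step.

```java
public class JsonHtmlEscapeSketch {

    /** Wraps s in double quotes and escapes it as a JSON string literal. */
    public static String toJsonString(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break; // attribute quotes in HTML
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c); // "=", ";", "<", ">" need no escaping in JSON
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        String html = "<td style=\"color:red;\">a=b</td>\n";
        System.out.println("{\"html\":" + toJsonString(html) + "}");
    }
}
```

With this kind of escaping, characters such as "=" and ";" pass through unchanged, so no encodeURI and decodeURI round trip is needed on the page.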
package com.ysy.officeRead.controller;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang.StringEscapeUtils;
import org.apache.poi.hssf.converter.ExcelToHtmlConverter;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.PicturesManager;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.usermodel.Picture;
import org.apache.poi.hwpf.usermodel.PictureType;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xwpf.converter.core.BasicURIResolver;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.w3c.dom.Document;
public class OfficeBeRead {
/**
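*url: full local path of the uploaded file on the server; also used to create the image folder (a UUID serves as the folder name, which is rather ugly)
*projectPath: path of the file on the server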
*/
public String poiWord2003ToHtml(String url, String projectPath) {
String pathString = url.substring(0, url.lastIndexOf("."));
String proString2 = projectPath.substring(0, projectPath.lastIndexOf("."))+"/";
String file = "1.doc";
String content = "";
// Create the folder and convert the document
try {
InputStream inputStream = new FileInputStream(url);
HWPFDocument worDocument = new HWPFDocument(inputStream);
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(DocumentBuilderFactory
.newInstance().newDocumentBuilder().newDocument());
wordToHtmlConverter.setPicturesManager(new PicturesManager() {
public String savePicture(byte[] content, PictureType pictureType, String suggestedName,
float widthInches, float heightInches) {
// TODO Auto-generated method stub
return suggestedName;
}
});
wordToHtmlConverter.processDocument(worDocument);
List pics = worDocument.getPicturesTable().getAllPictures();
if(pics!=null){
for (int i = 0; i < pics.size(); i++) {
Picture picture = (Picture) pics.get(i);
File file2 = new File(pathString,picture.suggestFullFileName());
if(!file2.exists()&&!file2.isDirectory()){
file2.getParentFile().mkdirs();
file2.createNewFile();
}
picture.writeImageContent(new FileOutputStream(pathString+"/"+picture.suggestFullFileName()));
}
}
Document htmlDocument = wordToHtmlConverter.getDocument();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(outputStream);
TransformerFactory tfFactory = TransformerFactory.newInstance();
Transformer serializer = tfFactory.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
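outputStream.close();
// The serializer wrote UTF-8 bytes, so decode them with the same charset
content = new String(outputStream.toByteArray(), "utf-8");
// Save the HTML, then rewrite the image paths
FileUtils.write(new File(pathString, "1.html"), content, "utf-8");
content = replaceAllStr(content, proString2);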
} catch (Exception e) {
// TODO: handle exception
e.printStackTrace();
}
return content;
}
/**
* url: path of the file after upload
* projectPath: access path of the file within the project
*/
public String poiWord2007ToHtml(String url,String projectPath){
String sourceFileNameString = url; // path of the source file
String imagePathString = url.substring(0, url.lastIndexOf("."));
String targetFileNameString = imagePathString+"1.html";
String proString2 = projectPath.substring(0, projectPath.lastIndexOf("."))+"/";
String out = "";
FileOutputStream outputStream = null;
OutputStreamWriter outputStreamWriter = null;
try {
XWPFDocument document = new XWPFDocument(new FileInputStream(sourceFileNameString));
XHTMLOptions options = XHTMLOptions.create();
// Folder in which the extracted images are stored
options.setExtractor(new FileImageExtractor(new File(imagePathString)));
// Path prefix of the images inside the HTML
options.URIResolver(new BasicURIResolver("/"));
File file2 = new File(targetFileNameString);
if(!file2.exists()&&!file2.isDirectory()){
file2.getParentFile().mkdirs();
file2.createNewFile();
}
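outputStream = new FileOutputStream(targetFileNameString);
// Write the HTML as UTF-8 so it can be read back with the same charset
outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");
XHTMLConverter xhtmlConverter = (XHTMLConverter) XHTMLConverter.getInstance();
xhtmlConverter.convert(document, outputStreamWriter, options);
// Make sure everything is on disk before the file is read back
outputStreamWriter.flush();
FileInputStream file = new FileInputStream( new File(targetFileNameString));
// size is the length of the content; read it all in one go
int size=file.available();
byte[] buffer=new byte[size];
file.read(buffer);
file.close();
out=new String(buffer, "UTF-8");
// Turns the decimal Unicode character references generated for Chinese text back into characters
out = StringEscapeUtils.unescapeHtml(out);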
System.out.println(out);
out = replaceAllStr(out, proString2);
} catch (Exception e) {
// TODO: handle exception
e.printStackTrace();
}finally{
if(outputStream != null){
try {
outputStream.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
if(outputStreamWriter != null){
try {
outputStreamWriter.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
return out;
}
/* public static void main(String[] args) {
System.out.println(new OfficeBeRead().poiWord2003ToHtml());
}*/
/**
*Inserts imgurl right after every <img src=" in content.
*
*Used here to rewrite the image source paths.
*Content without any <img> tag is returned unchanged.
*/
public String replaceAllStr(String content,String imgurl){
// String.replace treats both arguments literally (no regex)
String marker = "<img src=\"";
return content.replace(marker, marker + imgurl);
}
/**
*Converts Excel to HTML with POI
*This method cannot handle images
*
*/
public String PoiExcel2003ToHtml(String url,String projectPath){
File excelFile = new File(url);
InputStream iStream = null;
FileOutputStream outputStream = null;
StringWriter writer = null;
String imagePathString = url.substring(0, url.lastIndexOf("."));
String htmlFile = imagePathString+"1.html";
File htmlfile2 = new File(htmlFile);
File filep = new File(htmlfile2.getParent());
String content = "";
try {
if(excelFile.exists()){
if(!filep.exists()){
filep.mkdirs();
}
iStream = new FileInputStream(excelFile); // open the Excel file
HSSFWorkbook workbook = new HSSFWorkbook(iStream);
ExcelToHtmlConverter converter = new ExcelToHtmlConverter(DocumentBuilderFactory
.newInstance().newDocumentBuilder().newDocument());
converter.processWorkbook(workbook);
writer = new StringWriter();
Transformer serializer = TransformerFactory.newInstance().newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(
new DOMSource(converter.getDocument()),
new StreamResult(writer));
outputStream = new FileOutputStream(htmlFile);
outputStream.write(writer.toString().getBytes("UTF-8"));
FileInputStream fis = new FileInputStream(htmlfile2); // open the generated HTML file
int size = fis.available();
byte[] buffer=new byte[size];
fis.read(buffer);
fis.close();
content = new String(buffer, "UTF-8"); // the file was written as UTF-8 above
System.out.println(content);
outputStream.flush();
outputStream.close();
writer.close();
}
} catch (Exception e) {
// TODO: handle exception
e.printStackTrace();
} finally{
if(iStream != null){
try {
iStream.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
if(outputStream != null){
try {
outputStream.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
if(writer!=null){
try {
writer.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
return content;
}
/**
* Parses an Excel 2007 file with POI and generates HTML
* @param url path of the file
* @return the generated HTML page as a String
*/
public String PoiExcel2007ToHtml(String url,String projectPath){
StringBuffer content = new StringBuffer();
XSSFWorkbook xwb = null;
try{
// Build the XSSFWorkbook from the file path passed in as url
xwb = new XSSFWorkbook(url);
content.append("<html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"><title>Parse Excel With POI</title></head><body>");
// Loop over the sheets
for (int numSheet = 0; numSheet < xwb.getNumberOfSheets(); numSheet++) {
XSSFSheet xSheet = xwb.getSheetAt(numSheet);
if (xSheet == null) {
continue;
}
content.append("<h3 valign='middle' align='center'>"+xSheet.getSheetName()+"</h3>");
content.append("<table valign='middle' align='center' border=1 cellspacing=0 cellpadding=1>");
// Loop over the rows
for (int rowNum = 0; rowNum <= xSheet.getLastRowNum(); rowNum++) {
XSSFRow xRow = xSheet.getRow(rowNum);
if (xRow == null) {
continue;
}
content.append("<tr align='middle'>");
// Loop over the cells
for (int cellNum = 0; cellNum <= xRow.getLastCellNum(); cellNum++) {
XSSFCell xCell = xRow.getCell(cellNum);
if (xCell == null || "".equals(xCell)) {
content.append("<td>").append(" ").append("</td>");
}else if (xCell.getCellType() == XSSFCell.CELL_TYPE_BOOLEAN) {
content.append("<td>").append(" ").append(xCell.getBooleanCellValue()).append("</td>");
} else if (xCell.getCellType() == XSSFCell.CELL_TYPE_NUMERIC) {
content.append("<td>").append(" ").append(this.doubleToString(xCell.getNumericCellValue())).append("</td>");
} else{
content.append("<td>").append(" ").append(xCell.getStringCellValue()).append("</td>");
}
}
content.append("</tr>");
}
content.append("</table>");
}
content.append("</body></html>");
}catch(Exception e){
e.printStackTrace();
System.out.println("Error parsing Excel 2007 with POI");
}
return content.toString();
}
/**
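* Converts a double to a string without a trailing ".0" or exponent notation.
* Note: the exponent branch only drops the "E" suffix, so values such as 1.0E10 are not expanded correctly.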
* @param d
* @return
*/
public String doubleToString(double d){
String str = Double.valueOf(d).toString();
String temp = str;
String result = "";
if(str.indexOf("E")>2)
result = str.substring(0,1) + temp.substring(2, str.indexOf("E"));
else{
if(str.indexOf(".0")>0)
result = str.substring(0,str.indexOf(".0")) ;
else
result = str;
}
return result;
}
}
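The doubleToString helper above works on the text of Double.toString, which is fragile for large or small values in exponent form. As a hedged alternative, here is a minimal sketch that lets java.math.BigDecimal do the expansion. The class name PlainNumberSketch and the method toPlainString are hypothetical and not part of the original code.

```java
import java.math.BigDecimal;

public class PlainNumberSketch {

    /** Formats d without exponent notation and without a trailing ".0". */
    public static String toPlainString(double d) {
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            return Double.toString(d); // BigDecimal cannot represent these values
        }
        // valueOf goes through Double.toString, so 1.5 stays 1.5 rather than 1.50000000000000000000...
        BigDecimal value = BigDecimal.valueOf(d).stripTrailingZeros();
        // stripTrailingZeros may leave a negative scale (1E+10); toPlainString expands it.
        return value.toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(toPlainString(1.0E10));
        System.out.println(toPlainString(123.0));
        System.out.println(toPlainString(1.5));
    }
}
```

Inside PoiExcel2007ToHtml, a call such as toPlainString(xCell.getNumericCellValue()) could stand in for this.doubleToString(...). Note that dates stored as numeric cells would still appear as raw serial numbers with either helper.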
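POI: working with Word templates (Word 2003, Word 2007)
Recently my teacher gave me a task: generate various documents from Word templates, mainly by taking tags such as %title%, %name% and %content%, querying the database through class methods, and replacing the tags. Searching online, I found that POI handles documents rather well. Apache POI is an open-source library from the Apache Software Foundation that gives Java programs an API for reading and writing Microsoft Office file formats. The Word 2003 approach is simple: replace the text through a Range. Word 2007 is a bit more involved: the runs have to be traversed and their text replaced. I wrote a demo based on examples found online; here is the code.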
package com.poi.util;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.poi.POIXMLDocument;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;
import com.poi.model.Person;
public class WordUtil {
private Map<String, String> map = new HashMap<String, String>();// maps each tag to its replacement value
private String templatePath;// path of the template
public WordUtil(String templatePath,Person person) {
this.templatePath = templatePath;
if (templatePath.endsWith("docx")) {
try {
@SuppressWarnings("resource")
XWPFWordExtractor docx = new XWPFWordExtractor(
POIXMLDocument.openPackage(templatePath));
String docxText = docx.getText();
getValueMap(docxText,person);
} catch (Exception e) {
e.printStackTrace();
}
} else {
try {
@SuppressWarnings("resource")
WordExtractor doc = new WordExtractor(new FileInputStream(
templatePath));
String docText = doc.getText();
getValueMap(docText,person);
} catch (Exception e) {
e.printStackTrace();
}
}
}
// Collect every tag in the text and the value the Person object supplies for it
private void getValueMap(String text,Person person) {
Pattern pattern = Pattern.compile("%(.*?)%");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
String key = matcher.group(1);
String value = person.getString(key);
if (value == null) {
value = "";
}
map.put(key, value);
}
}
public void createDoc(String newPath) {
if (templatePath.endsWith("docx")) {
replaceDoc2007(newPath);
} else {
replaceDoc2003(newPath);
}
}
public void replaceDoc2007(String newPath) {
try {
OPCPackage pack = POIXMLDocument.openPackage(templatePath);
XWPFDocument doc = new XWPFDocument(pack);
// Process the paragraphs
List<XWPFParagraph> paragraphList = doc.getParagraphs();
processParagraphs(paragraphList, map);
// Process the tables
Iterator<XWPFTable> it = doc.getTablesIterator();
while (it.hasNext()) {
XWPFTable table = it.next();
List<XWPFTableRow> rows = table.getRows();
for (XWPFTableRow row : rows) {
List<XWPFTableCell> cells = row.getTableCells();
for (XWPFTableCell cell : cells) {
List<XWPFParagraph> paragraphListTable = cell
.getParagraphs();
processParagraphs(paragraphListTable, map);
}
}
}
FileOutputStream fos = new FileOutputStream(newPath);
doc.write(fos);
fos.flush();
fos.close();
} catch (Exception e) {
e.printStackTrace();
}
}
public void replaceDoc2003(String newPath) {
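// Keep everything before the last ".doc" and make sure the output ends in ".doc"
// (split(".doc") would treat "." as a regex wildcard)
int extIndex = newPath.lastIndexOf(".doc");
if (extIndex >= 0) {
newPath = newPath.substring(0, extIndex);
}
newPath = newPath + ".doc";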
try {
FileInputStream fis = new FileInputStream(new File(templatePath));
HWPFDocument doc = new HWPFDocument(fis);
Range bodyRange = doc.getRange();
for (Map.Entry<String, String> entry : map.entrySet()) {
bodyRange.replaceText("%" + entry.getKey() + "%",
entry.getValue());
}
// Write the Word file
ByteArrayOutputStream ostream = new ByteArrayOutputStream();
doc.write(ostream);
OutputStream outs = new FileOutputStream(newPath);
outs.write(ostream.toByteArray());
outs.close();
} catch (Exception e) {
e.printStackTrace();
}
}
private void processParagraphs(List<XWPFParagraph> paragraphList,
Map<String, String> map) {
for (XWPFParagraph paragraph : paragraphList) {
List<XWPFRun> runs = paragraph.getRuns();
for (XWPFRun run : runs) {
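String text = run.getText(0);
if (text == null) {
continue; // runs without text (e.g. pictures) have nothing to replace
}
boolean isSetText = false;
for (Map.Entry<String, String> entry : map.entrySet()) {
// Match the full tag including the % marks, not just the bare key
String tag = "%" + entry.getKey() + "%";
if (text.indexOf(tag) != -1) {
isSetText = true;
text = text.replace(tag, entry.getValue());
}
}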
if (isSetText) {
run.setText(text, 0);
}
}
}
}
}
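The tag replacement itself can be tried out without POI. The following minimal sketch applies the same %tag% idea to a plain string with java.util.regex. The class name PlaceholderSketch, the method fill, and the stricter pattern %(\w+)% (instead of the %(.*?)% used above) are assumptions made for illustration; the stricter pattern keeps a lone percent sign such as "100%" from being read as the start of a tag.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {
    // A tag is a run of word characters between two percent signs, e.g. %title%.
    private static final Pattern TAG = Pattern.compile("%(\\w+)%");

    /** Replaces every known %tag% in text; unknown tags are left as they are. */
    public static String fill(String text, Map<String, String> values) {
        Matcher m = TAG.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps "$" and "\" in the value from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<String, String>();
        values.put("name", "Li");
        values.put("title", "POI");
        System.out.println(fill("Dear %name%, re: %title%", values));
    }
}
```

One caveat for the docx case: Word may split a single tag across several runs, in which case a per-run replacement like processParagraphs will not see the complete tag. Applying a function like fill to the joined text of a paragraph is one way around that, at the cost of losing per-run formatting inside that paragraph.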