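Preface:
I recently needed to process docx files and display their content on a web page so that Word documents can be read online. I used Flask together with the Document class from python-docx to parse the docx file and render it. The code follows.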
Document handling:
First install the library that provides Document. Start with the latest version of python-docx; if that does not work, switch to version 1.1.0:
pip install python-docx
pip install python-docx==1.1.0
The code that processes the docx file is as follows:
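import base64
import re

from docx import Document

# vaReportDir (the root directory of the reports) is defined elsewhere in the module.

def ReadVADocx(ProjectName, DocxName):
    # Windows-style path: <vaReportDir>\<ProjectName>\<DocxName>
    docxfilepath = vaReportDir + "\\" + ProjectName + "\\" + DocxName
    paragraphs = ReadDocx(docxfilepath)
    return paragraphs

def ReadDocx(docxfilepath):
    doc = Document(docxfilepath)
    paragraphs = list()
    # Relationship ids such as "rId7" link a run to an embedded part (e.g. an image)
    pattern = re.compile(r'rId\d+')
    for graph in doc.paragraphs:
        # "Heading 1" -> "1", "Title" -> "Title"; body-text styles get no level
        level = graph.style.name.split(' ')[-1]
        if level == "Normal":
            level = None
        elif level == "Preformatted":
            level = None
        paragraph = {'text': graph.text, 'level': level, 'images': ""}
        paragraphs.append(paragraph)
        for run in graph.runs:
            # A run without text may carry an inline image
            if run.text == '':
                contentID = pattern.search(run.element.xml)
                if contentID:
                    contentID = contentID.group(0)
                    try:
                        contentType = doc.part.related_parts[contentID].content_type
                    except KeyError as e:
                        print(e)
                        continue
                    if not contentType.startswith('image'):
                        continue
                    # Embed the image bytes as base64 so the page can show them inline
                    imgData = doc.part.related_parts[contentID].blob
                    image_base64 = base64.b64encode(imgData).decode('utf-8')
                    paragraph = {
                        'text': run.text,
                        'level': run.style.name.split(' ')[-1] if run.style.name.startswith('Heading') else None,
                        'images': image_base64,
                    }
                    paragraphs.append(paragraph)
    return paragraphs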
The code above walks through the docx file and appends each piece of content, together with its heading level, to a list.
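To make the shape of that list concrete, here is a minimal sketch of how a level is derived from a style name, mirroring the `split(' ')[-1]` logic in the reader. The helper name `level_from_style` is hypothetical and not part of the original code, and the style names are assumed to be Word's default English ones.

```python
from typing import Optional

def level_from_style(style_name: str) -> Optional[str]:
    """Return the last word of a style name, or None for body-text styles."""
    level = style_name.split(' ')[-1]
    if level in ("Normal", "Preformatted"):
        return None
    return level

# Entries in the same shape as the dictionaries built by the reader
paragraphs = [
    {'text': 'Report', 'level': level_from_style('Title'), 'images': ''},
    {'text': 'Overview', 'level': level_from_style('Heading 1'), 'images': ''},
    {'text': 'Some body text.', 'level': level_from_style('Normal'), 'images': ''},
]

for p in paragraphs:
    print(p['level'], p['text'])
```

One consequence of this rule is that any other style name also produces a level: "List Paragraph", for example, yields "Paragraph", so documents that use such styles may need extra cases.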
Here is the code that calls it:
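from flask import render_template, request

# app (the Flask application), engine (the module containing ReadVADocx)
# and user are defined elsewhere in the project.

@app.route('/ViewVADocx', methods=['GET'])
def ViewVADocx():
    try:
        DocxName = request.args.get('docx')
        ProjectName = request.args.get('name')
        paragraphs = engine.ReadVADocx(ProjectName, DocxName)
        return render_template("viewdocx.html", n_getname=ProjectName, n_user=user, paragraphs=paragraphs)
    except Exception as e:
        # Any failure (missing file, unreadable docx, ...) shows the error page
        return render_template('error-500.html')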
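Since the project and file names come straight from the query string, it may be worth checking the resulting path before opening it. The following is only a sketch under assumptions: `resolve_docx_path` and the `reports` directory are hypothetical names, and the original code simply concatenates the strings.

```python
from pathlib import Path

def resolve_docx_path(base_dir: str, project_name: str, docx_name: str) -> Path:
    """Join the parts and make sure the result stays inside base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / project_name / docx_name).resolve()
    # Reject names such as "..\\..\\secret.docx" that escape the report directory
    if base not in candidate.parents:
        raise ValueError("path escapes the report directory")
    if candidate.suffix.lower() != ".docx":
        raise ValueError("not a .docx file")
    return candidate

print(resolve_docx_path("reports", "demo", "result.docx").name)
```

Using `pathlib` also avoids hard-coding the `\\` separator, so the same code would work outside Windows.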
Writing the HTML:
Next, the content needs to be displayed on the page. The HTML template is listed below:
{% extends "mould.html" %}
{% block head %}{% endblock %}
{% block body %}
{# Plain HTML tags (div, a, p, style, script) and the anchor ids were stripped from the published listing; they are restored here as assumptions. #}
<div id="floatingBox" class="floating-box">↑Back to top↑</div>
{{ n_getname }}: scan node line
Quick navigation:
{% for paragraph in paragraphs %}
  {% if paragraph.level == "1" %}
    <a href="#{{ loop.index0 }}" class="hover-link">{{ paragraph.text }}</a>
  {% elif paragraph.level == "2" %}
    <a href="#{{ loop.index0 }}" class="hover-link2">{{ paragraph.text }}</a>
  {% endif %}
{% endfor %}
{% for paragraph in paragraphs %}
  {% if paragraph.level %}
    {% if paragraph.level == "Title" %}
      <!-- {{ paragraph.text }} -->
    {% else %}
      <h{{ paragraph.level }} id="{{ loop.index0 }}">{{ paragraph.text }}</h{{ paragraph.level }}>
    {% endif %}
  {% else %}
    {% if paragraph.images %}
      <img class="aligncenter" src="data:image/png;base64,{{ paragraph.images }}" />
    {% else %}
      <p>{{ paragraph.text }}</p>
    {% endif %}
  {% endif %}
{% endfor %}
{% endblock %}
{% block list %}
<style>
.hover-link { font-size: 20px; }
.hover-link:hover { color: red; font-size: 30px; }
.hover-link2 { font-size: 15px; }
.hover-link2:hover { color: red; font-size: 20px; }
/* CSS styles that define the appearance of the floating box */
.floating-box {
  position: fixed;
  bottom: 20px;
  right: 20px;
  width: 80px;
  height: 50px;
  background-color: #ff9900;
  color: #fff;
  text-align: center;
  line-height: 50px;
  cursor: pointer;
}
</style>
<script>
// JavaScript code
var floatingBox = document.getElementById('floatingBox');
// Click event listener: scroll smoothly back to the top
floatingBox.addEventListener('click', function() {
  window.scrollTo({ top: 0, behavior: 'smooth' });
});
</script>
{% endblock %}
The template also adds styling and small conveniences such as a back-to-top button to make browsing easier.
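The branching in the template can be hard to follow, so here is a rough illustration of the same decisions written in plain Python. The function `render_paragraphs` is hypothetical, and the `id` attributes on headings and the `<p>` wrapper are assumptions rather than something confirmed by the original listing.

```python
from html import escape

def render_paragraphs(paragraphs):
    """Build the navigation links and the body HTML from the paragraph list."""
    nav, body = [], []
    for index, p in enumerate(paragraphs):
        text = escape(p['text'])
        level = p['level']
        # Only level 1 and level 2 headings appear in the quick navigation
        if level in ('1', '2'):
            css = 'hover-link' if level == '1' else 'hover-link2'
            nav.append(f'<a href="#{index}" class="{css}">{text}</a>')
        if level == 'Title':
            continue  # the template comments the title out
        if level:
            body.append(f'<h{level} id="{index}">{text}</h{level}>')
        elif p['images']:
            # Images are embedded directly as a base64 data URI
            body.append(f'<img class="aligncenter" src="data:image/png;base64,{p["images"]}" />')
        else:
            body.append(f'<p>{text}</p>')
    return '\n'.join(nav), '\n'.join(body)

nav_html, body_html = render_paragraphs([
    {'text': 'Intro', 'level': '1', 'images': ''},
    {'text': 'a < b', 'level': None, 'images': ''},
    {'text': '', 'level': None, 'images': 'QUJD'},
])
print(nav_html)
print(body_html)
```

Because both loops run over the same list, the index used in a navigation link matches the index of the heading it points to.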
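Afterword:
The code only displays the content of a docx file, both text and images, and groups it by heading level. It does not support editing the docx file; if you are interested, you can explore that yourself.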