当前位置:网站首页>Efficient office of fintech (I): automatic generation of trust plan specification
Efficient office of fintech (I): automatic generation of trust plan specification
2022-06-23 06:05:00 【cjh_ hit】
Efficient office of fintech : Automatically generate the trust plan specification
background
Computers have greatly improved people's work efficiency , But in addition to using mature software on the market , The financial industry also has to meet the actual business needs , Write your own gadgets to improve office efficiency .
The internship company gave me a task yesterday afternoon , Said he was in a hurry : According to two word The mapping relationship between file paragraphs automatically generates the trust plan specification . In particular , One document is the due diligence report , It contains relevant information about business participants , The information is filled in according to the specific template . Another document is the plan statement , There are also specific templates .
( So there is not much content in this project , Just use Python Will a word Copy the specified paragraph in the file to another word Specify the location in the file .)
Due diligence report :
Instructions :
The mapping table :
demand
In the past, employees copied it manually 、 Paste , But the company has a large volume , Deal with a lot of contracts every day , Therefore, it is necessary to write a program to automatically generate plan specifications according to the mapping relationship to improve office efficiency .
To write
Finally, by checking the data , In the end use Python-docx The library has developed a program that can A The content between two paragraphs in a document ( Include Text and tables ) Copied to the B Procedure after the specified paragraph of the document .
Because the first contact Python-docx, Not very familiar with the principles and details of many interfaces .Python-docx The principle seems to be that Python-docx The structure of is transformed into xml. The data was used last year Java Handle xml The program , But I have forgotten for a long time …
The source code is given directly below :
Because for the first time Python-docx, If there is any non-standard or non Introduction , Please forgive me and point out .
Of course , Before use in adopt pip install Python-docx.
from docx import Document
from docx.text.paragraph import Paragraph
from docx.oxml.text.paragraph import CT_P
from docx.oxml.table import CT_Tbl
from docx.table import Table
from copy import deepcopy
import pandas as pd
def copyText(filename,paratext,Para):
document = Document(filename)
paras=document.paragraphs
index=0
if type(paratext)==str:
print('copy:',paratext,Para.text)
for para in paras:
if para.text==paratext:
index=paras.index(para)+1
para=paras[index]
else:
print('copy:',Para.text,paratext.text)
for para in paras:
if para.text== paratext.text:
index = paras.index(para) + 1
paratext.runs[0].drawing_lst:
para = paras[index]
newPara=para.insert_paragraph_before()
for run in Para.runs:
# Copy content ( Include styles )
newParaRun=newPara.add_run(run.text)
newParaRun.bold = run.bold
newParaRun.italic = run.italic
newParaRun.underline = run.underline
newParaRun.font.color.rgb = run.font.color.rgb
newParaRun.style.name = run.style.name
newPara.paragraph_format.alignment = Para.paragraph_format.alignment
newPara.paragraph_format.first_line_indent = Para.paragraph_format.first_line_indent
newPara.paragraph_format.left_indent = Para.paragraph_format.left_indent
newPara.paragraph_format.right_indent = Para.paragraph_format.right_indent
document.save(filename)
def copyTable(filename,paratext,table):
# Copy the form
document = Document(filename)
paras = document.paragraphs
if type(paratext)==str:
for para in paras:
#print(para.text)
if paratext == para.text :
paragraph=para
tbl, p = table._tbl, paragraph._p
else:
for para in paras:
# print(para.text)
if paratext.text == para.text:
paragraph = para
tbl, p = table._tbl, paragraph._p
new_tbl = deepcopy(tbl)
p.addnext(new_tbl)
document.save(filename)
def Copy_Contents_Between_ParaA_ParaB_to_ParaC(filename1, filename2,Paratext1,Paratext2,Paratext3):
documentA = Document(filename1)
paragraphs = documentA.paragraphs# All paragraphs
Paratext1 = Paratext1.encode('utf-8').decode('utf-8')
for aPara in paragraphs:
if Paratext1 == aPara.text :# Match to the beginning paragraph
ele = aPara._p.getnext()
break
while(True):# Traverse backward
if ele==None:
break
if ele.tag == '':
break
if isinstance(ele, CT_P):# It's a paragraph
para = Paragraph(ele, documentA)
if Paratext2 == para.text:
break
copyText(filename2, Paratext3, para)# Copy the form
if para.text!='':
Paratext3=para
elif isinstance(ele, CT_Tbl):# It's a form
table=Table(ele,documentA)
copyTable(filename2,Paratext3,table)# Copy the form
ele=ele.getnext()
if __name__ == '__main__':
data = pd.read_excel(' To tune out - Plan specification mapping table .xlsx')
for i in range(len(data[' Plan statement ( To generate table )'])):
Copy_Contents_Between_ParaA_ParaB_to_ParaC(' Data sources - Due diligence report .docx',' The resulting trust plan statement .docx',data[' Due diligence report - Start paragraph '][i],
data[' Due diligence report - End paragraph '][i],data[' Plan statement ( To generate table )'][i])
Plan specification generated after program execution :
You can see , The data specified in the due diligence report has been copied to the specified location in the plan specification .
Problems to be solved :
The above program can copy text, tables and styles , But you can't copy pictures . according to the understanding of ,Python-docx There is no interface to extract the picture at the specified location ( At least not in the official manual ), So secondary development is needed , It is necessary to study Python-docx And some xml The knowledge of the . But because time is limited ( Online classes still need to be watched , Homework still needs to be written ), I will leave this question to my internship director .
If the big guys know how to solve the problem of image processing , Please grant me your advice .
边栏推荐
- WordPress contact form entries cross cross site scripting attack
- The official artifact of station B has cracked itself!
- Cloud native database is the future
- Adnroid activity screenshot save display to album view display picture animation disappear
- Pat class B 1015 C language
- Visual Studio调试技巧
- Pat class B 1023 minimum decimals
- [database backup] complete the backup of MySQL database through scheduled tasks
- Layer 2技术方案进展情况
- Alibaba cloud ack one and ACK cloud native AI suite have been newly released to meet the needs of the end of the computing era
猜你喜欢

jvm-01.指令重排

Cloud native database is the future

Adnroid activity截屏 保存显示到相册 View显示图片 动画消失

Huawei's software and hardware ecosystem has taken shape, fundamentally changing the leading position of the United States in the software and hardware system
![[cocos2d-x] screenshot sharing function](/img/fc/e3d7e5ba164638e2c48bc4a52a7f13.png)
[cocos2d-x] screenshot sharing function

金融科技之高效办公(一):自动生成信托计划说明书

jvm-03.jvm内存模型

十一、纺织面料下架功能的实现

What benefits have digital collections enabled the real industry to release?

Data migration from dolphin scheduler 1.2.1 to dolphin scheduler 2.0.5 and data test records after migration
随机推荐
使用链表实现两个多项式相加和相乘
Redis 哨兵
android Handler内存泄露 kotlin内存泄露处理
【数据库备份】通过定时任务完成MySQL数据库的备份
Summary of ant usage (I): using ant to automatically package apk
Pat class B 1014 C language
十一、纺织面料下架功能的实现
Explicability of counter attack based on optimal transmission theory
Android handler memory leak kotlin memory leak handling
Perfect squares for leetcode topic analysis
Ansible uses ordinary users to manage the controlled end
App SHA1 acquisition program Baidu map Gaode map simple program for acquiring SHA1 value
云原生数据库是未来
Adnroid activity screenshot save display to album view display picture animation disappear
PAT 乙等 1021 个位数统计
Pat class B 1019 C language
MDM data cleaning function development description
jvm-03.jvm内存模型
ORB_ Slam2 operation
Palindrome number for leetcode topic analysis