Merging PDF Files
2022-07-24 05:58:00 【Didi'cv】
import os
# PdfReader/PdfWriter were named PdfFileReader/PdfFileWriter before PyPDF2 3.0
from PyPDF2 import PdfReader, PdfWriter


# Sort key: the integer before the first dot of the file name ("3.pdf" -> 3).
# Files whose names do not start with an integer sort last.
def getnumber(path):
    try:
        res = int(os.path.basename(path).split('.')[0])
    except ValueError:
        res = 10000000
    return res


# Use os.walk to collect the paths of all PDF files under the given
# directory (subdirectories included), sorted by the number in each name.
def getFileName(filedir):
    file_list = [os.path.join(root, filespath)
                 for root, dirs, files in os.walk(filedir)
                 for filespath in files
                 if filespath.lower().endswith('.pdf')]
    return sorted(file_list, key=getnumber)


# Merge all PDF files under the given directory into a single file
def MergePDF(filepath, outfile):
    outPath = os.path.join(filepath, outfile)
    # Refuse to overwrite an existing output file. Checking first also keeps
    # an old output file from being picked up as a merge input.
    if os.path.isfile(outPath):
        print(outPath, "already exists. Please delete it and try again!")
        return False

    pdf_fileName = getFileName(filepath)
    if not pdf_fileName:
        # print("There are no PDF files to merge!")
        return False

    output = PdfWriter()
    outputPages = 0
    for pdf_file in pdf_fileName:
        print("Path: %s" % pdf_file)
        # Read the source PDF file
        reader = PdfReader(pdf_file)
        # Get the total number of pages in the source PDF
        pageCount = len(reader.pages)
        outputPages += pageCount
        print("Pages: %d" % pageCount)
        # Add each page to the output
        for page in reader.pages:
            output.add_page(page)
    print("Total merged pages: %d." % outputPages)

    # Write the target PDF file
    with open(outPath, "wb") as outputStream:
        output.write(outputStream)
    # print("PDF merge complete!")
    return True
if __name__ == "__main__":
    file_dir = input('Please enter the PDF folder: ').strip()  # folder holding the source PDFs
    outfile = "out.pdf"  # name of the output PDF
    flag = MergePDF(file_dir, outfile)
    if flag:
        print('PDF merge complete')
    else:
        print('PDF merge failed. Please try again!')
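The file-collection step of the script can be tried on its own, without PyPDF2. The sketch below is a hypothetical variant, not the author's code: `collect_pdfs`, its `exclude` parameter, and `demo` are names introduced here for illustration. It walks a directory the same way `getFileName` does, matches the `.pdf` extension case-insensitively, and skips the output file so that an earlier `out.pdf` is never merged into the new one.

```python
import os
import tempfile


def collect_pdfs(folder, exclude="out.pdf"):
    """Return paths of all PDF files under folder, skipping files named exclude."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(".pdf") and name != exclude:
                found.append(os.path.join(root, name))
    return sorted(found)


def demo():
    """Build a throwaway folder and return the base names that were collected."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("1.pdf", "2.PDF", "notes.txt", "out.pdf"):
            with open(os.path.join(tmp, name), "wb") as fh:
                fh.write(b"")  # empty placeholder; the content is irrelevant here
        return [os.path.basename(p) for p in collect_pdfs(tmp)]


if __name__ == "__main__":
    print(demo())
```

Only the two PDF inputs are returned; the text file and the previous output file are left out.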
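Usage

Name the PDF files with numbers (1.pdf, 2.pdf, ...) to set the merge order, run the script, and enter the folder path when prompted. The merged file is written to the same folder as out.pdf.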
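The script sorts by number rather than by name because a plain string sort would place `10.pdf` before `2.pdf`. Here is a minimal sketch of the difference, using a `number_key` helper that mirrors the script's `getnumber` (the helper name and the sample file names are made up for illustration):

```python
import os


def number_key(path):
    """Integer before the first dot of the file name; non-numeric names sort last."""
    stem = os.path.basename(path).split(".")[0]
    try:
        return int(stem)
    except ValueError:
        return 10000000


names = ["10.pdf", "2.pdf", "cover.pdf", "1.pdf"]

by_name = sorted(names)                    # lexicographic order
by_number = sorted(names, key=number_key)  # order used for merging

print(by_name)    # ['1.pdf', '10.pdf', '2.pdf', 'cover.pdf']
print(by_number)  # ['1.pdf', '2.pdf', '10.pdf', 'cover.pdf']
```

Files without a leading number, such as `cover.pdf` here, end up after all numbered files.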