Run jieba word segmentation on requirement text and export Excel files sorted by word frequency
2022-07-25 22:21:00 【Buddhist monk】
Read the Excel data into a DataFrame, segment each row, and write the word counts:
import pandas as pd
import jieba

# Reading a legacy .xls file requires the xlrd package
df = pd.read_excel('xuqiufenxi.xls')
print(df)

# Create a new column to store the word segmentation results
df['fenci'] = ''

# The column names 'Content of requirements' and 'function' must match
# the headers in the spreadsheet.
# Segment the text of each row and save the result in the new column
for i in range(len(df)):
    print(i)
    # df.loc avoids the chained assignment df['fenci'][i] = ...,
    # which may not write back to df
    df.loc[i, 'fenci'] = ' '.join(jieba.cut(df.loc[i, 'Content of requirements']))
    print(df.loc[i, 'fenci'])

    # Count the number of times each word appears in this row
    word_count = {}
    for word in df.loc[i, 'fenci'].split():
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1

    # Convert the word_count dictionary into a DataFrame
    word_count_df = pd.DataFrame(list(word_count.items()), columns=['word', 'count'])
    # Sort by count in descending order
    word_count_df = word_count_df.sort_values(by='count', ascending=False)
    # Write one Excel file per row, named after the row's 'function' value
    word_count_df.to_excel(f"{df.loc[i, 'function']}.xlsx", index=False)
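Output: one .xlsx file per row, named after that row's 'function' value, listing each word and its count in descending order of frequency.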
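The per-row counting step can also be written with `collections.Counter` from the standard library. The following is a minimal sketch, not the author's code: the helper name `word_frequencies` and the sample sentence are made up for illustration, and a whitespace tokenizer stands in for `jieba.cut` so that the snippet runs without jieba installed.

```python
from collections import Counter


def word_frequencies(text, tokenize):
    """Return (word, count) pairs sorted by descending count.

    `tokenize` is any callable that turns a string into an iterable of
    tokens, for example `jieba.cut` (hypothetical usage, not run here).
    """
    # Strip surrounding whitespace and drop empty tokens before counting.
    tokens = (tok.strip() for tok in tokenize(text))
    counts = Counter(tok for tok in tokens if tok)
    # most_common() sorts by count, highest first.
    return counts.most_common()


# Whitespace tokenizer used only so the example runs without jieba.
pairs = word_frequencies("export report export data", str.split)
print(pairs)  # [('export', 2), ('report', 1), ('data', 1)]
```

With jieba installed, passing `jieba.cut` as `tokenize` should give the counts for one row of text, and `pd.DataFrame(pairs, columns=['word', 'count'])` would then produce a table already sorted by count.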