Using a Hugging Face model to translate English
2022-07-24 11:54:00 【This Livermore is not too cold】
The Baidu Translate API charges a fee, so we use an open-source model to translate English instead.
from concurrent.futures import ThreadPoolExecutor

from tqdm import tqdm
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline


def get_en_to_zh_model():
    # Load the English-to-Chinese model and wrap it in a translation pipeline
    model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
    tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
    translation = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer)
    return translation


def en_to_ch(text):
    # Translate English into Chinese
    # Relies on the module-level `translation` pipeline created in the __main__ block
    # text = "Student accommodation centres, resorts"
    translated_text = translation(text, max_length=1024)[0]['translation_text']
    return translated_text


def ch_to_en():
    # Translate Chinese into English
    model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-zh-en")
    tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-zh-en")
    translation = pipeline("translation_zh_to_en", model=model, tokenizer=tokenizer)
    # Sample input; replace it with the Chinese text to translate
    text = "Student accommodation Center, A vacation home"
    translated_text = translation(text, max_length=40)[0]['translation_text']
    return translated_text

def get_translate_list_single(ori_txt):
    """Single thread"""
    with open(ori_txt, 'r') as fp:
        contents = fp.readlines()
    translate_list = []
    for sample in tqdm(contents):
        # Drop the line ending, whether LF or CRLF
        sample = sample.rstrip("\r\n")
        print(sample)
        translated_text = en_to_ch(sample)
        print(translated_text)
        translate_list.append("{}***{}\n".format(sample, translated_text))
    with open('/cloud/cloud_disk/users/huh/nlp/base_catree_Text_Categorization/script/fu.txt', 'w') as fp:
        fp.writelines(translate_list)

def translate_english_to_chinese(tmp_sentence):
    """Translate English into Chinese (worker for the multithreaded path)"""
    en_zh_list = []
    # Drop the line ending, whether LF or CRLF
    tmp_sentence = tmp_sentence.rstrip("\r\n")
    translated_text = en_to_ch(tmp_sentence)
    print(translated_text)
    en_zh_list.append("{} *** {}\n".format(tmp_sentence, translated_text))
    return en_zh_list
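
def get_translate_list_multi(ori_txt, end_txt):
    """Multithreading"""
    with open(ori_txt, 'r') as fp:
        contents = fp.readlines()
    # The with-block shuts the pool down once all submitted tasks have finished
    with ThreadPoolExecutor(max_workers=10) as executor:
        en_zh_list = [executor.submit(translate_english_to_chinese, tmp_sentence) for tmp_sentence in contents]
    end_list = []
    for sample in en_zh_list:
        # Futures are read in submission order, so the output follows the input file.
        # Each entry already ends with "\n", so this leaves a blank line between entries.
        end_list.append("{}\n".format(sample.result()[0]))
    with open(end_txt, 'w') as f:
        f.writelines(end_list)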

if __name__ == '__main__':
    ori_txt = '/cloud/cloud_disk/users/huh/nlp/base_catree_Text_Categorization/script/cope_dataset/translate_english_to_chinese/question.txt'
    end_txt = '/cloud/cloud_disk/users/huh/nlp/base_catree_Text_Categorization/script/fu.txt'
    translation = get_en_to_zh_model()
    # Single thread
    # get_translate_list_single(ori_txt)
    get_translate_list_multi(ori_txt, end_txt)
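
The multithreaded path can also be written with `executor.map`, which returns results in input order and lets the translator be passed in as an argument instead of being read from a global. What follows is only a minimal sketch, not the author's implementation: `translate_lines` and `fake_translate` are hypothetical names, and `fake_translate` is a stand-in so the example runs without downloading a model. In practice you would pass `en_to_ch` in its place.

```python
from concurrent.futures import ThreadPoolExecutor


def translate_lines(lines, translate, max_workers=10):
    """Translate each non-empty line and return 'source *** target' strings.

    `translate` is any callable that maps one string to its translation
    (for example the en_to_ch function from the script).
    """
    # Strip surrounding whitespace (including line endings) and skip blank lines.
    sources = [line.strip() for line in lines if line.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the inputs.
        targets = list(executor.map(translate, sources))
    return ["{} *** {}\n".format(src, tgt) for src, tgt in zip(sources, targets)]


def fake_translate(text):
    # Hypothetical stand-in for a real model call: it just upper-cases the text.
    return text.upper()


if __name__ == "__main__":
    for row in translate_lines(["hello\n", "\n", "good morning\n"], fake_translate):
        print(row, end="")
```

Whether threads actually speed up model inference depends on the hardware and backend, so it is worth timing this against the single-threaded loop on your own data.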