Peking University, UC Berkeley, et al. | Domain-Adaptive Text Classification with Structured Knowledge from Unlabeled Data
2022-06-23 21:52:00 [BAAI Community]
Authors: Tian Li, Xiang Chen, Zhen Dong, et al.
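Summary: Domain-adaptive text classification is a challenging problem for large-scale pretrained language models, because they usually need expensive additional labeled data to adapt to a new domain. Existing work generally fails to exploit the implicit relationships among words across domains. In this paper, the authors propose a new method, Domain Adaptation with Structured Knowledge (DASK), which strengthens domain adaptation by exploiting word-level semantic relationships. DASK first builds a knowledge graph that captures the relationships between pivot terms (domain-independent words) and non-pivot terms in the target domain. Then, during training, DASK injects pivot-related knowledge-graph information into the source-domain texts. For the downstream task, these knowledge-injected texts are fed into a BERT variant that can process knowledge-injected text. Because of the knowledge injection, the model learns domain-invariant features for non-pivots according to their relationships with pivots. To ensure that pivots behave in a domain-invariant way, DASK infers them dynamically during training from the polarity scores of candidate pivots, computed with pseudo-labels. The authors validate DASK on a wide range of cross-domain sentiment classification tasks and observe absolute improvements of up to 2.9% over baselines across 20 different domain pairs. Code will be made available at https://github.com/hikaru-nara/DASK.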
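The pipeline described in the summary (score candidate pivots, link pivots to target-domain non-pivots, inject those links into source text) can be illustrated with a toy sketch. This is not the authors' implementation: the polarity formula, the co-occurrence rule for graph edges, the bracketed injection format, and every function name and threshold below are assumptions made only for illustration, and the BERT variant is omitted entirely.

```python
from collections import Counter, defaultdict


def polarity_scores(texts, labels, min_count=2):
    """Assumed score: P(positive | word) - P(negative | word), in [-1, 1].

    `labels` are 0/1 sentiment labels; for the target domain they would be
    pseudo-labels predicted by the current model.
    """
    pos, total = Counter(), Counter()
    for text, label in zip(texts, labels):
        for word in set(text.lower().split()):
            total[word] += 1
            pos[word] += label
    return {w: (2 * pos[w] - total[w]) / total[w]
            for w in total if total[w] >= min_count}


def select_pivots(source_scores, target_scores, threshold=0.5):
    """Keep words with strong polarity of the same sign in both domains."""
    pivots = set()
    for word, s in source_scores.items():
        t = target_scores.get(word)
        if t is not None and abs(s) >= threshold and abs(t) >= threshold and s * t > 0:
            pivots.add(word)
    return pivots


def build_graph(target_texts, pivots):
    """Toy knowledge graph: pivot -> non-pivots it co-occurs with in target texts."""
    graph = defaultdict(set)
    for text in target_texts:
        words = set(text.lower().split())
        for pivot in words & pivots:
            graph[pivot] |= words - pivots
    return graph


def inject(text, graph, max_neighbors=2):
    """Append each pivot's graph neighbors, in brackets, right after the pivot."""
    out = []
    for word in text.split():
        out.append(word)
        neighbors = sorted(graph.get(word.lower(), ()))[:max_neighbors]
        if neighbors:
            out.append("[" + " ".join(neighbors) + "]")
    return " ".join(out)


# Made-up example data: movie reviews as source, electronics reviews as target.
source_texts = ["great plot", "great acting", "boring plot", "boring acting"]
source_labels = [1, 1, 0, 0]
target_texts = ["great battery", "great screen", "boring battery", "boring screen"]
target_pseudo_labels = [1, 1, 0, 0]

pivots = select_pivots(polarity_scores(source_texts, source_labels),
                       polarity_scores(target_texts, target_pseudo_labels))
graph = build_graph(target_texts, pivots)
print(inject("great plot", graph))  # prints: great [battery screen] plot
```

In this toy data, "great" and "boring" carry the same polarity in both domains and are kept as pivots, while "plot" and "battery" are domain-specific non-pivots. The injected source sentence exposes the target-domain words tied to each pivot, which is the intuition behind learning domain-invariant features for non-pivots.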


Paper download: https://arxiv.org/pdf/2206.09591.pdf