2022 Tsinghua summer school notes L1: NLP and big model foundations
2022-07-24 21:33:00 【The duck neck is gone】
2022 Tsinghua University cross-disciplinary seminar on large models
2022-6-27
L1
1 NLP part
- NLP tasks
  - Part-of-speech tagging
  - Named entity recognition (omission phenomena)
  - Coreference resolution (pronouns)
  - Dependencies between sentence constituents
  - Automatic Chinese word segmentation
Applications:
- NLP in search engines
  - Matching the similarity between a query and a document; given a query, relevant advertisements can also be served
  - Judging the quality of documents
- NLP combined with knowledge graphs
  - Fully extracting and using knowledge
- Human assistants
- Translation (removing language barriers)
- Using language as a lens to analyze society
Word representation:
- Turn the meaning of words into a form that machines can understand
- Used to compute word similarity and word relations
- Disadvantages (of manually annotated representations):
  - Annotation is manual, so new meanings of words are missed
  - Subtle differences between words are lost
  - Subjective
  - Data sparsity
  - Requires a lot of human labor
one-hot
- Each word is independent: it gets its own dimension, and all remaining dimensions are 0
- Words are therefore orthogonal by default; the similarity between any two different words is 0
- Improvement: the meaning of a word is related to its context
  - A word is represented by the words that commonly appear in its context
  - Disadvantages:
    - The vector space grows with the vocabulary
    - For low-frequency words the context is sparse, so the results are poor
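The two representations above can be compared with a small sketch. The toy corpus, the window size of 1, and the helper names (`one_hot`, `cosine`, `context_vector`) are assumptions made for illustration, not part of the lecture.

```python
import math
from collections import Counter

def one_hot(word, vocab):
    """Return a vector with a 1 at the word's own index and 0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocab]

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def context_vector(word, sentences, vocab, window=1):
    """Count how often each vocabulary word appears within `window` positions of `word`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return [float(counts[w]) for w in vocab]

# Hypothetical toy corpus, for illustration only.
sentences = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]
vocab = sorted({tok for s in sentences for tok in s})

cat_1h, dog_1h = one_hot("cat", vocab), one_hot("dog", vocab)
cat_ctx = context_vector("cat", sentences, vocab)
dog_ctx = context_vector("dog", sentences, vocab)

print(cosine(cat_1h, dog_1h))    # one-hot: different words are orthogonal
print(cosine(cat_ctx, dog_ctx))  # context counts: shared neighbours give similarity
```

Both vectors have one dimension per vocabulary word, which is why the space grows with the vocabulary; a word seen only once would have an almost empty context vector.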
word embedding
- Build a low-dimensional vector space and represent each word as a vector in it
- Word2Vec
Language Model
- Models language: predicts the next word from the preceding text
  - Joint probability: the probability that a sequence of words forms a sentence
  - Conditional probability: the probability of the next word given the words so far
- How is this done?
  - Assumption: a future word is only affected by the words before it
  - The joint probability can then be factored, which gives the relationship between the joint probability and the conditional probabilities
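Written out, this factorization is the chain rule of probability. The notation here is an assumption of this note: $w_1, \ldots, w_n$ are the words of a sentence of length $n$, and for $j = 1$ the condition is empty, so the first factor is simply $P(w_1)$.

$$
P(w_1, w_2, \ldots, w_n) = \prod_{j=1}^{n} P(w_j \mid w_1, \ldots, w_{j-1})
$$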
N-gram Model
- E.g., 4-gram (in an n-gram model, a word depends only on the n-1 words before it)

$$
P(w_j \mid \text{never too late to}) = \frac{\operatorname{count}(\text{too late to } w_j)}{\operatorname{count}(\text{too late to})}
$$

("never" does not enter the calculation, because a 4-gram conditions only on the previous three words.)
- Disadvantages:
  - Only a short span of context is taken into account
  - The similarity between words is still not considered
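The count-based estimate above can be sketched in a few lines of Python. The corpus and the function names `ngram_counts` and `ngram_prob` are hypothetical, chosen only to illustrate the formula; unseen contexts are given probability 0 here, with no smoothing.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_prob(tokens, context, word):
    """Estimate P(word | context) = count(context + word) / count(context)."""
    context = tuple(context)
    numerator = ngram_counts(tokens, len(context) + 1)[context + (word,)]
    denominator = ngram_counts(tokens, len(context))[context]
    return numerator / denominator if denominator else 0.0

# Hypothetical toy corpus, for illustration only.
corpus = "it is never too late to learn and never too late to try".split()

# 4-gram: only the last three words of "never too late to" are used as context.
history = ["never", "too", "late", "to"]
context = history[-3:]
print(ngram_prob(corpus, context, "learn"))
print(ngram_prob(corpus, context, "try"))
```

Any word that never follows "too late to" in the corpus gets probability 0, however similar it is to "learn" or "try", which is the second disadvantage listed above.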
Neural Language Model
- Built on distributed representations of words
- First represent each of the preceding 3 words as a low-dimensional vector, then concatenate these vectors into one higher-dimensional vector, and use that vector to predict the next word.
- All predictions are made through the representation of the context.
- The parameters are learned by training the model
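A minimal sketch of the forward pass described above, in plain Python. Everything here is an assumption for illustration: the toy vocabulary, the embedding size, the randomly initialized (untrained) weights, and the single linear output layer. Practical neural language models typically add hidden layers and learn all parameters from data.

```python
import math
import random

random.seed(0)

vocab = ["never", "too", "late", "to", "learn"]   # hypothetical toy vocabulary
word_to_id = {w: i for i, w in enumerate(vocab)}
EMBED_DIM = 4      # size of each low-dimensional word vector (arbitrary choice)
CONTEXT_SIZE = 3   # number of preceding words used as context

def random_matrix(rows, cols):
    """Small random values standing in for parameters that would be learned."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# Untrained parameters: one embedding per word, one output row per word.
embeddings = random_matrix(len(vocab), EMBED_DIM)
output_weights = random_matrix(len(vocab), CONTEXT_SIZE * EMBED_DIM)

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_distribution(context_words):
    """Embed each context word, concatenate, and score every vocabulary word."""
    assert len(context_words) == CONTEXT_SIZE
    concatenated = []
    for w in context_words:
        concatenated.extend(embeddings[word_to_id[w]])
    scores = [sum(wt * x for wt, x in zip(row, concatenated)) for row in output_weights]
    return softmax(scores)

probs = next_word_distribution(["too", "late", "to"])
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Because the weights are random, the printed distribution is close to uniform; training would adjust `embeddings` and `output_weights` so that likely next words receive higher probability.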
2 Big models
2.1 Introduction
The mechanisms and details of pre-trained language models (PLMs).
- Why PLMs?
  - They perform very well on both language understanding and language generation
  - Growing number of parameters
  - Growing amount of computation
  - Growing computing power
  - Example: GPT-3
    - Rich knowledge
    - Zero-/few-shot learning (no labeled examples, or only a few)
- Paradigm
  - Learn from unlabeled data: pre-train on self-supervised tasks to acquire rich general knowledge. For a specific application, then introduce task-specific knowledge to adapt the model.
  - Word embeddings
  - Contextual word embeddings
    - ELMo, ULMFiT
  - Transformer
- Typical cases
  - GPT
  - BERT
2.2 Demo
- Big model demos
  - GPT-3 (question answering)
  - Large models for code
  - DALL-E 2 (image generation)
  - Search engines