Excel data extraction technique: a universal formula for extracting numbers from mixed text
2022-06-24 06:10:00 【User 8639654】
In the previous article, Xiaohua walked through three data-extraction scenarios, each solved by studying the features of the mixed text and writing a formula specific to it. Afterwards, some readers whispered to Xiaohua: "Miss Xiaohua, I'm not clever enough to spot the features of the data, and I'm too lazy to write a different formula for every scenario. Is there an all-conquering universal formula that can brute-force any kind of mixed text?"
The answer, of course, is yes! We still need to distinguish two cases, though. One is extracting numeric values, which can be positive or negative, have a magnitude, and may contain a decimal point. The other is extracting digit strings such as telephone numbers and ID numbers, which have no decimal point or minus sign and no notion of magnitude.
How are the universal formulas for these two cases written, and how do they work? Let Xiaohua explain.
4. A universal formula for extracting numeric values
Applicable scenario: the text contains no digits other than the target value; any other digits are likely to interfere with the result.
Universal formula:
{=-LOOKUP(9^9,-MIDB(A2,MIN(FINDB(LEFT(ROW($1:$11)-2,1),A2&-1/19)),ROW($1:$100)))}
The formula (an array formula, entered with Ctrl+Shift+Enter) breaks down as follows:
① LEFT(ROW($1:$11)-2,1)
ROW($1:$11) is easy to understand: it returns the row numbers of rows 1 through 11, i.e. set A {1,2,3,…,11}. Subtracting 2 turns it into set B {-1,0,1,2,…,9}. LEFT then takes the first character of each element of set B, producing character set C {"-","0","1","2",…,"9"}: the minus sign plus the ten digits 0-9. Apart from the decimal point, every numeric value is written with these 11 characters.
In short, this part constructs all the characters that make up a number. They are what let us locate the value and then extract it.
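To make step ① concrete, here is a minimal Python sketch of the same construction. It is only an emulation of the Excel logic, not Excel itself, and the names `row_numbers` and `charset_c` are made up for illustration.

```python
# Emulates LEFT(ROW($1:$11)-2,1): the first character of -1, 0, 1, ..., 9.
row_numbers = range(1, 12)                         # ROW($1:$11) -> 1..11
charset_c = [str(n - 2)[0] for n in row_numbers]   # LEFT(..., 1) on each value

print(charset_c)
```

The first element is "-" because the text form of -1 starts with the minus sign; the remaining ten elements are the digits 0 to 9.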
②FINDB(①,A2&-1/19)
FINDB finds the position of a character in the target text. Unlike FIND, it returns a byte position, counting each Chinese character or full-width symbol as 2 bytes (this applies when a double-byte language such as Chinese is the default language). So in the mixed text in A2, the minus sign "-" is at position 5 rather than 3.
The formula searches in A2&-1/19 so that every character of character set C {"-","0","1","2",…,"9"} is guaranteed to appear in the search text and FINDB never returns an error. This works because -1/19 is -0.0526315789473684…, which contains the minus sign and all ten digits. Fragment ② returns the positions at which the characters of set C first appear in A2&-1/19, i.e. position set D {5,13,10,6,…}.
③MIN(②)
MIN(②) takes the smallest value in position set D {5,13,10,6,…}. That is the position in A2 at which a minus sign or digit first appears, i.e. the starting position of the target value. This is why there must be no unrelated digits or minus signs to the left of the target value.
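Steps ② and ③ can be sketched in Python as follows. This is an emulation under stated assumptions: the cell content `a2` is a hypothetical stand-in (two Chinese characters followed by -299.19, chosen to agree with the positions quoted above), the helper `findb` is invented for illustration, and GBK encoding is assumed as the double-byte character set.

```python
# Text that Excel produces for -1/19 (15 significant digits).
EXCEL_TAIL = "-0.0526315789473684"

def findb(needle: str, text: str) -> int:
    """1-based byte position of needle in text; Chinese characters count as 2 bytes (GBK)."""
    return text.encode("gbk").index(needle.encode("gbk")) + 1

a2 = "支出-299.19"                                   # hypothetical cell content
charset_c = ["-"] + [str(d) for d in range(10)]      # result of step 1

# Step 2: FINDB(charset_c, A2 & -1/19) -> one position per character.
positions = [findb(ch, a2 + EXCEL_TAIL) for ch in charset_c]

# Step 3: MIN(...) -> where the number starts inside A2.
start = min(positions)

print(positions)
print(start)
```

Appending the tail guarantees that `findb` always finds a match, so `index` never raises, which mirrors how the formula avoids error values.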
④-MIDB(A2,③,ROW($1:$100))
MIDB is used here rather than MID to match FINDB: it extracts part of the text by byte position. ROW($1:$100) returns the sequence 1 to 100, which serves as MIDB's third argument, the number of bytes to extract; that is, the formula extracts 1, 2, …, 100 bytes in turn.
So MIDB starts at the position determined in ③ and cuts 100 strings of 1 to 100 bytes from the text in A2: the unequal-length strings E {"-","-2","-29","-299",…,"-299.19"}. The leading minus in -MIDB negates each string. Non-numeric text produces the error #VALUE!, so strings E become array F, made up of numbers and #VALUE! errors: {#VALUE!;2;29;299;299;299.1;299.19;…;299.19}.
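The following Python sketch emulates step ④. The cell content `a2`, the helpers `midb` and `negate`, and the use of `None` to stand for #VALUE! are all assumptions for illustration; Python's `float` is only an approximation of how Excel coerces text to numbers.

```python
def midb(text: str, start: int, num_bytes: int) -> str:
    """Rough MIDB: take num_bytes bytes from 1-based byte position start (GBK)."""
    data = text.encode("gbk")[start - 1:start - 1 + num_bytes]
    return data.decode("gbk", errors="ignore")

def negate(text: str):
    """Emulates -text: a negated number, or None standing in for #VALUE!."""
    try:
        return -float(text)
    except ValueError:
        return None

a2 = "支出-299.19"   # hypothetical cell content
start = 5            # result of step 3 for this text

strings_e = [midb(a2, start, n) for n in range(1, 101)]   # 100 substrings
array_f = [negate(s) for s in strings_e]                  # numbers and errors

print(strings_e[:7])
print(array_f[:7])
```

Once the substring reaches the end of the text it stops growing, so every later element repeats the full value 299.19.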
⑤-LOOKUP(9^9,④)
LOOKUP has three relevant characteristics:
1. It assumes the lookup range is sorted in ascending order, i.e. the later a value appears, the larger it is.
2. It returns the value that is less than or equal to the lookup value and closest to it.
3. It ignores error values in the lookup range.
So we use a very large lookup value, 9^9 (387,420,489). Because of characteristic 1, LOOKUP treats the last non-error value in the lookup range as the largest and returns it. Together, these characteristics neatly skip the error values and pick the last valid value, which comes from the longest numeric string. The outer minus sign then undoes the negation from step ④ and restores the original sign.
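A simplified Python sketch of step ⑤ follows. The function `lookup_last` is hypothetical and uses a plain linear scan; it only reproduces the behaviour described above for the case where the lookup value exceeds every entry, and does not model LOOKUP in general. `None` stands in for #VALUE!.

```python
def lookup_last(lookup_value, values):
    """Return the last non-error entry that does not exceed lookup_value."""
    result = None
    for v in values:
        if v is not None and v <= lookup_value:
            result = v          # later entries overwrite earlier ones
    return result

# Array F from step 4 for the sample value -299.19.
array_f = [None, 2.0, 29.0, 299.0, 299.0, 299.1, 299.19] + [299.19] * 93

# -LOOKUP(9^9, array_f): pick the last valid value, then restore the sign.
result = -lookup_last(9 ** 9, array_f)

print(result)
```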
5. A universal formula for extracting digit strings
Usage: extract every digit in the target cell in order and join them together.
Universal formula:
{=SUM(MID(0&A2,LARGE(ISNUMBER(--MID(A2,ROW($1:$100),1))*ROW($1:$100),ROW($1:$100))+1,1)*10^ROW($1:$100)/10)}
The formula breaks down briefly as follows:
① ISNUMBER(--MID(A2,ROW($1:$100),1))*ROW($1:$100)
MID(A2,ROW($1:$100),1) extracts the characters one at a time. The double minus coerces each one to a number, so digits convert and everything else becomes an error; ISNUMBER then tests each result and returns an array of logical values. Finally, multiplying by ROW($1:$100) makes each digit return its position in the A2 text and every other character return 0.
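Step ① can be sketched in Python like this. The sample text `a2` and the names `cells` and `marks` are hypothetical, and the digit test is a stand-in for ISNUMBER(--…) that only recognises the ASCII digits 0-9.

```python
a2 = "ab12c345"   # hypothetical cell content

# MID(A2, ROW($1:$100), 1): one character per position, "" past the end.
cells = [a2[i - 1:i] for i in range(1, 101)]

# ISNUMBER(--cell) * ROW: the position for digits, 0 for everything else.
marks = [
    i if len(c) == 1 and c in "0123456789" else 0
    for i, c in enumerate(cells, start=1)
]

print(marks[:8])
```

Positions beyond the end of the text yield empty strings, which are not numbers, so they contribute 0 just like letters do.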
② LARGE(①,ROW($1:$100))
LARGE re-sorts the position values from ① in descending order. A digit's position is always greater than 0, and the lower-order (further right) a digit is, the larger its position value; every other character is always 0. The effect is to push all the 0 values to the end while reversing the order of the digit positions.
③ MID(0&A2,②+1,1)
MID takes one character at a time from 0&A2, at each position from ② plus 1. Non-digit positions are 0, so they all return the first character, the prepended 0. Digits are unaffected, because prepending 0 shifts every character of A2 one place to the right. Since the digit positions in ② are reversed, the digits come out in reverse order.
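Steps ② and ③ can be emulated in Python as below. The sample text `a2` and its position array are hypothetical, and the variable names are invented for illustration.

```python
a2 = "ab12c345"                                       # hypothetical cell content
positions = [0, 0, 3, 4, 0, 6, 7, 8] + [0] * 92       # step 1 result for a2

# Step 2: LARGE(positions, ROW($1:$100)) -> sort in descending order.
ordered = sorted(positions, reverse=True)

# Step 3: MID(0&A2, position+1, 1). With the "0" prepended, 1-based position
# p+1 is 0-based index p, so position 0 picks the prepended "0".
padded = "0" + a2
reversed_digits = [padded[p] for p in ordered]

print(ordered[:6])
print(reversed_digits[:6])
```

The digits come out right to left, followed by a run of "0" characters standing in for the non-digit positions.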
④ SUM(③*10^ROW($1:$100)/10)
The first three steps yield an ordered array holding all the digits in A2, in reverse order, followed by a run of 0s standing in for the non-digit positions. To finish, the digits must be put back in forward order, the 0 values dropped, and everything merged. *10^ROW($1:$100)/10 does all of this: it multiplies the k-th element by 10^(k-1), so the first extracted digit (the rightmost in A2) becomes the ones place and each following digit moves one place to the left. The trailing 0s end up as leading zeros of the total and vanish. The resulting multi-digit number is the extracted digit string. Because the result is a number, any leading zeros in the original digits are dropped as well, and Excel keeps only 15 significant digits.
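The place-value trick in step ④ can be checked with a small Python sketch. The input array is a hypothetical result of step ③ for a cell containing the digits 1, 2, 3, 4, 5; Python integers are exact, so this does not reproduce Excel's 15-digit precision limit.

```python
# Digits in reverse order, followed by 0s for the non-digit positions.
reversed_digits = list("54321") + ["0"] * 95

# SUM(digit * 10^ROW / 10): the k-th element is weighted by 10^(k-1).
total = sum(int(d) * 10 ** (k - 1) for k, d in enumerate(reversed_digits, start=1))

print(total)
```

The first element lands in the ones place and each later one moves a place to the left, so the reversed digits read forwards again; the zeros contribute nothing to the sum.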
In fact, extracting digit strings has a very simple, no-brainer solution in Excel 2019 and later: just join the digits directly with CONCAT.
The universal formula for Excel 2019 and later is as follows:
{=CONCAT(IFERROR(--MID($A2,ROW($1:$100),1),""))}
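For comparison, the CONCAT approach maps onto a very short Python sketch. The sample text and the function name `extract_digits` are hypothetical; the digit test again stands in for IFERROR(--MID(…),"") and only recognises ASCII digits.

```python
def extract_digits(text: str) -> str:
    """Keep each ASCII digit in order and join them; other characters become ""."""
    return "".join(ch for ch in text if ch in "0123456789")

sample = "Tel 010-6655 ext 07"   # hypothetical cell content
digits = extract_digits(sample)

print(digits)
```

Because the result is text rather than a number, leading zeros and long digit strings survive intact, which suits phone numbers and ID numbers.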