Are \w and [A-Za-z0-9_], and \d and [0-9], equivalent?
2022-06-27 21:59:00 【JAPAN_ is_ shit】
When I first studied regular expressions I had this question: why does Baidu Encyclopedia say they are equivalent?
To answer it you need to understand the Unicode character set, or at least know where Chinese characters, English letters, and digits sit in it (see the Unicode character encyclopedia).
In the Unicode table, the commonly used Chinese characters occupy the range 4e00-9fa5.
English letters, digits, and special symbols belong to the Latin part of Unicode.
Therefore, in Python 3, \w is much broader than [A-Za-z0-9_]: it also matches the letters of other scripts, and \d also matches the digits of other scripts.
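To make the difference concrete, here is a minimal sketch comparing \w with the explicit class [A-Za-z0-9_], and using the 4e00-9fa5 range quoted above to pick out Chinese characters. The sample string and the variable names are my own illustration, not from the original post.

```python
import re

# Hypothetical sample text mixing ASCII and Chinese characters.
text = "abc_123 中文"

# In Python 3, \w in a str pattern is Unicode-aware by default.
unicode_words = re.findall(r"\w+", text)

# The explicit class only covers ASCII letters, digits, and underscore.
ascii_words = re.findall(r"[A-Za-z0-9_]+", text)

# The range quoted above selects Chinese characters only.
chinese = re.findall(r"[\u4e00-\u9fa5]+", text)

print(unicode_words)  # ['abc_123', '中文']
print(ascii_words)    # ['abc_123']
print(chinese)        # ['中文']
```

So the two notations only agree as long as the input is pure ASCII.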
This is not limited to \w and \d. The other regex metacharacters \W, \D, \s, \S, \b, and \B are Unicode-aware as well. So how do you stop them from matching the whole Unicode range?
Pass the re.ASCII flag so that they match ASCII characters only.
import re
# Extended Arabic-Indic digits
s = "۱۲۳۴۵۶۷۸۹"
print(s.isdigit())
# True
a = re.match(r'\d+', s)  # \d matches decimal digits from any script
print(a.group())
# ۱۲۳۴۵۶۷۸۹
# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿ'
b = re.match(r'\w+', d)  # \w matches letters, digits, and underscore
print(b.group())
# ᠠᠡᠢᠣᠤᠶᠿ
# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿ'
b = re.match(r'\D+', d)  # \D matches any non-digit character
print(b.group())
# ᠠᠡᠢᠣᠤᠶᠿ
# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿ'
b = re.match(r'\S+', d)  # \S matches any non-whitespace character
print(b.group())
# ᠠᠡᠢᠣᠤᠶᠿ
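# Extended Arabic-Indic digits
s = "۱۲۳۴۵۶۷۸۹"
print(s.isdigit())
# True
a = re.match(r'.+', s)  # . matches any character except a newline
print(a.group())
# ۱۲۳۴۵۶۷۸۹
# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿᠢᠣᠤ'
b = re.findall(r'\bᠠᠡ', d)  # \b matches a word boundary
print(b)
# ['ᠠᠡ']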
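The list above also names \W, \s, and \B, which the snippets so far do not exercise. The following is a small sketch of their Unicode-aware behaviour. The sample string, including the choice of the ideographic space U+3000, is an assumption made for illustration.

```python
import re

# U+3000 is the ideographic space used in CJK text (illustrative choice).
spaced = "中文\u3000文本"

# \s is Unicode-aware, so it splits on the ideographic space.
parts = re.split(r"\s+", spaced)

# \W matches the one non-word character between the two words.
non_word = re.findall(r"\W", spaced)

# \B matches only where there is no word boundary, i.e. inside a word,
# so only the first 文 (preceded by 中) is found.
inside = re.findall(r"\B文", spaced)

print(parts)          # ['中文', '文本']
print(len(non_word))  # 1
print(inside)         # ['文']
```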
Once re.ASCII is set, \w no longer matches Mongolian at all:
# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿᠢᠣᠤ'
b = re.findall(r'\w+', d, re.ASCII)  # with re.ASCII, \w means [a-zA-Z0-9_] only
print(b)
# [] nothing matches
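The same flag narrows \d as well. As a rough sketch (the variable names are mine), the comparison below runs one pattern in both modes and also uses the inline form (?a), which Python's re module treats as equivalent to passing re.ASCII.

```python
import re

# Extended Arabic-Indic digits one to three, written as escapes.
digits = "\u06f1\u06f2\u06f3"

# Default (Unicode) mode: \d accepts any decimal digit character.
unicode_match = re.fullmatch(r"\d+", digits)

# re.ASCII narrows \d to [0-9], so the same pattern no longer matches.
ascii_match = re.fullmatch(r"\d+", digits, re.ASCII)

# The inline flag (?a) has the same effect as passing re.ASCII.
inline_match = re.fullmatch(r"(?a)\d+", digits)

# ASCII digits still match when the flag is set.
plain_match = re.fullmatch(r"\d+", "123", re.ASCII)

print(unicode_match is not None)  # True
print(ascii_match is not None)    # False
print(inline_match is not None)   # False
print(plain_match is not None)    # True
```

With re.ASCII set, \d and [0-9] (and likewise \w and [A-Za-z0-9_]) really are equivalent, which is presumably the situation the encyclopedia entry describes.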