Using and Parsing XML for Data Storage and Transmission

2022-06-25 16:05:00 Hua Weiyun

Concept: XML stands for Extensible Markup Language.

Extensible: all tags are user-defined.

  • Purpose
    Storing data
    • As a configuration file
    • For transmission over a network
  • Differences between XML and HTML
    1. XML tags are all user-defined; HTML tags are predefined.
    2. XML syntax is strict; HTML syntax is lenient.
    3. XML is for storing data; HTML is for displaying data.

Syntax:

Basic syntax:

  1. XML documents use the .xml file extension
  2. The document declaration, when present, must be on the first line, before any other content (XML 1.0 allows it to be omitted, but including it is good practice)
  3. An XML document has exactly one root tag
  4. Attribute values must be enclosed in quotation marks (single or double)
  5. Tags must be closed correctly
  6. XML tag names are case-sensitive
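
The rules above can be checked mechanically. Here is a minimal sketch, assuming the JDK's built-in JAXP DOM parser is an acceptable stand-in for whatever parser you actually use; the class name WellFormedCheck and the sample strings are made up for illustration. A parser rejects any input that breaks a well-formedness rule.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class WellFormedCheck {

    /** Returns true if the given text parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler so fatal errors are thrown instead of printed.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness rule was violated
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<users><user id='1'/></users>")); // one root, quoted attribute
        System.out.println(isWellFormed("<users/><users/>"));              // rule 3: two root tags
        System.out.println(isWellFormed("<user id=1/>"));                  // rule 4: unquoted attribute value
        System.out.println(isWellFormed("<user><name>zjq</user>"));        // rule 5: name is never closed
        System.out.println(isWellFormed("<User></user>"));                 // rule 6: case mismatch
    }
}
```

Only the first input should be accepted; each of the others breaks exactly one of the rules listed above.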

Quick start:

	<?xml version='1.0' ?>
	<users>
		<user id='1'>
			<name> Have a drink together </name>
			<age>23</age>
			<gender>superman</gender>
			<br/>
		</user>

		<user id='2'>
			<name>zjq</name>
			<age>18</age>
			<gender>man</gender>
		</user>
	</users>

Components:

Document declaration

  1. Format: <?xml attribute-list ?>
  2. Attribute list:
    version: the version number; a required attribute
    encoding: the character encoding. It tells the parser which character set the document uses. When it is omitted and there is no byte order mark, parsers assume UTF-8 (not ISO-8859-1)
    standalone: whether the document is standalone
    Values:
    yes: does not depend on other files
    no: depends on other files
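
To see these attributes from code, here is a small sketch that reads them back through the standard DOM API in the JDK. The class name DeclarationInfo and the sample document are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclarationInfo {

    /** Parses XML text from UTF-8 bytes, the way a parser would read a file. */
    static Document parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<?xml version='1.0' encoding='UTF-8' standalone='yes' ?><users/>");
        // Each getter reports what the document declaration said.
        System.out.println("version    = " + doc.getXmlVersion());
        System.out.println("encoding   = " + doc.getXmlEncoding());
        System.out.println("standalone = " + doc.getXmlStandalone());
    }
}
```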

Processing instructions: used, for example, to attach a CSS stylesheet

<?xml-stylesheet type="text/css" href="a.css" ?>

Tags: tag names are user-defined

Rules:
Names can contain letters, digits, and other characters
Names cannot begin with a digit or a punctuation mark
Names cannot begin with the letters xml (or XML, Xml, and so on)
Names cannot contain spaces

Attributes

The value of an id attribute must be unique

Text

CDATA section: the data inside it is treated as literal text and is not parsed as markup
Format: <![CDATA[ data ]]>
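
As a quick illustration of why CDATA is handy, the sketch below (JDK DOM parser; the class name CdataDemo and the sample text are made up) parses the same text written once as a CDATA section and once with escaped entities. Both forms should yield the same characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataDemo {

    /** Returns the text content of the root element of the given XML. */
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Inside CDATA, '<' and '&' need no escaping.
        String withCdata = "<code><![CDATA[if (a < b && b > c) return;]]></code>";
        // Without CDATA, the same characters must be written as entities.
        String escaped = "<code>if (a &lt; b &amp;&amp; b &gt; c) return;</code>";

        System.out.println(textOf(withCdata));
        System.out.println(textOf(escaped));
    }
}
```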

Constraints: rules that define how an XML document must be written

As users of a framework (programmers), we need to:

  1. Import the constraint document into our XML
  2. Be able to read simple constraint documents

Categories:

  1. DTD: a simple constraint technology
  2. Schema: a more complex constraint technology

DTD

Importing a DTD into an XML document

  • Internal DTD: the constraint rules are defined inside the XML document itself
  • External DTD: the constraint rules are defined in a separate dtd file
    Local: <!DOCTYPE root-tag-name SYSTEM "location of the dtd file">
    Network: <!DOCTYPE root-tag-name PUBLIC "name of the dtd file" "URL of the dtd file">
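
To make the idea of a constraint concrete, here is a hedged sketch that validates a document against an internal DTD using the JDK's JAXP parser. The DTD, the element names (students, student, name, age), and the class name DtdValidation are invented for the example; they are not the article's actual constraint file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class DtdValidation {

    // Hypothetical internal DTD: each student must contain a name followed by an age.
    static final String DTD =
            "<!DOCTYPE students [" +
            "<!ELEMENT students (student*)>" +
            "<!ELEMENT student (name, age)>" +
            "<!ATTLIST student number ID #REQUIRED>" +
            "<!ELEMENT name (#PCDATA)>" +
            "<!ELEMENT age (#PCDATA)>" +
            "]>";

    static final String VALID = DTD +
            "<students><student number='s001'><name>zjq</name><age>18</age></student></students>";

    static final String INVALID = DTD +
            "<students><student number='s001'><name>zjq</name></student></students>";

    /** Parses with DTD validation switched on and returns the validation errors found. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid document errors:   " + validate(VALID));
        System.out.println("invalid document errors: " + validate(INVALID));
    }
}
```

The second document is well-formed but leaves out the age element, so the validating parser should report a constraint violation for it.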

Schema

Importing a schema:

  1. Write the root element of the XML document
  2. Declare the xsi prefix: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  3. Point to the xsd file for the namespace: xsi:schemaLocation="http://www.zjq.com/xml student.xsd"
  4. Declare a namespace for each xsd constraint, with a prefix to tell them apart when there are several; here the default namespace is used: xmlns="http://www.zjq.com/xml"
Example:
<students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.zjq.com/xml"
xsi:schemaLocation="http://www.zjq.com/xml student.xsd">
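
Here is a rough sketch of schema validation with the JDK's javax.xml.validation API. The contents of student.xsd are not shown in this article, so the XSD below is a made-up stand-in that only reuses the http://www.zjq.com/xml namespace. The schema is also handed to the validator directly instead of being located through xsi:schemaLocation, which keeps the example self-contained.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class SchemaValidation {

    // Hypothetical schema: students contains student elements with a name and an integer age.
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'" +
            " targetNamespace='http://www.zjq.com/xml'" +
            " xmlns='http://www.zjq.com/xml' elementFormDefault='qualified'>" +
            "<xsd:element name='students'>" +
            "<xsd:complexType><xsd:sequence>" +
            "<xsd:element name='student' maxOccurs='unbounded'>" +
            "<xsd:complexType><xsd:sequence>" +
            "<xsd:element name='name' type='xsd:string'/>" +
            "<xsd:element name='age' type='xsd:int'/>" +
            "</xsd:sequence></xsd:complexType>" +
            "</xsd:element>" +
            "</xsd:sequence></xsd:complexType>" +
            "</xsd:element></xsd:schema>";

    /** Returns true if the XML text is valid against the schema above. */
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<students xmlns='http://www.zjq.com/xml'>"
                + "<student><name>zjq</name><age>18</age></student></students>";
        String bad = "<students xmlns='http://www.zjq.com/xml'>"
                + "<student><name>zjq</name><age>eighteen</age></student></students>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad)); // age is not an integer
    }
}
```

Note that the default namespace on the root element has to match the schema's targetNamespace; that is the link the xmlns declaration in the example above establishes.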

Parsing: working with XML documents

Operations on an XML document

  1. Parsing (reading): read the data in the document into memory
  2. Writing: save the data in memory to an XML document for persistent storage
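
The article's own code only covers reading, so here is a minimal sketch of the writing direction using the JDK's DOM and Transformer APIs. The class name WriteXml and the element values are made up, and the result goes to a string so the example has no side effects.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class WriteXml {

    /** Builds a small DOM tree in memory and serializes it to XML text. */
    static String build() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        Element users = doc.createElement("users"); // the single root tag
        doc.appendChild(users);

        Element user = doc.createElement("user");
        user.setAttribute("id", "1");
        users.appendChild(user);

        Element name = doc.createElement("name");
        name.setTextContent("zjq");
        user.appendChild(name);

        // An identity transform copies the in-memory tree to the output.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```

For persistent storage, passing new StreamResult(new java.io.File("users.xml")) instead of the StringWriter writes the same content to a file; the file name here is only an example.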

Ways to parse XML

  1. DOM: loads the whole markup document into memory at once and builds a DOM tree in memory
    Advantage: easy to work with; supports all CRUD operations
    Disadvantage: uses a lot of memory
  2. SAX: reads the document sequentially and is event-driven
    Advantage: uses very little memory
    Disadvantage: read-only; you cannot add, delete, or modify content
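
To show what "event-driven" means in practice, here is a small SAX sketch using the parser that ships with the JDK. The class name SaxNames and the sample input are assumptions for illustration. The parser calls the handler as it meets each start tag, run of text, and end tag, and the handler collects the content of every name element without ever building a tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxNames {

    /** Collects the trimmed text of every name element in the given XML. */
    static List<String> names(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a name element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length); // text may arrive in several chunks
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    result.add(text.toString().trim());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<users><user id='1'><name> tom </name></user>"
                + "<user id='2'><name>zjq</name></user></users>";
        System.out.println(names(xml));
    }
}
```

Because only the current event is held in memory, this style suits large documents, at the cost of not being able to go back and modify anything.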

Common XML parsers

  1. JAXP: the parser API provided by Sun; supports both the DOM and SAX approaches
  2. DOM4J: a very good parser
  3. Jsoup: a Java HTML parser that can parse a URL or HTML text directly. It provides a very convenient API for extracting and manipulating data through DOM traversal, CSS selectors, and jQuery-like methods.
  4. PULL: the parser built into the Android operating system; it works in a streaming, SAX-like way.

Jsoup

Quick start

Steps:

  1. Import the jar
  2. Get the Document object
  3. Get the Element object for the tag you want
  4. Get the data

Maven coordinates:

<!--jsoup-->
<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.14.3</version>
</dependency>
<!--JsoupXpath-->
<dependency>
  <groupId>cn.wanghaomiao</groupId>
  <artifactId>JsoupXpath</artifactId>
  <version>2.5.1</version>
</dependency>

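Code:

import java.io.File;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupDemo1 {
    public static void main(String[] args) throws IOException {
        //2.1 Get the path of student.xml on the classpath
        String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
        //2.2 Parse the XML file: load the document into memory and get the DOM tree (a Document)
        Document document = Jsoup.parse(new File(path), "utf-8");
        //3. Get the Element objects for the name tag
        Elements elements = document.getElementsByTag("name");
        System.out.println(elements.size());
        //3.1 Get the first name Element
        Element element = elements.get(0);
        //3.2 Get its data
        String name = element.text();
        System.out.println(name);
    }
}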

Using the objects:

Jsoup: a utility class that parses HTML or XML documents and returns a Document

parse: parses an HTML or XML document and returns a Document

parse(File in, String charsetName): parses an XML or HTML file
parse(String html): parses an XML or HTML string
parse(URL url, int timeoutMillis): fetches an HTML or XML document from the given network address and parses it

Document: the document object; it represents the DOM tree in memory

Getting Element objects

getElementById(String id): gets the unique element with the given id attribute value
getElementsByTag(String tagName): gets the collection of elements with the given tag name
getElementsByAttribute(String key): gets the collection of elements that have the given attribute
getElementsByAttributeValue(String key, String value): gets the collection of elements whose attribute has the given value

Elements: a collection of Element objects; it can be used like an ArrayList<Element>

Element: an element object

Getting child Element objects

getElementById(String id): gets the unique element with the given id attribute value
getElementsByTag(String tagName): gets the collection of elements with the given tag name
getElementsByAttribute(String key): gets the collection of elements that have the given attribute
getElementsByAttributeValue(String key, String value): gets the collection of elements whose attribute has the given value

Getting attribute values

String attr(String key): gets the attribute value for the given attribute name

Getting text content

String text(): gets the text content
String html(): gets the entire content of the tag body (including child tags, as a string)

Node: the node object

Node is the parent class of Document and Element

Quick queries:

  1. Selector: CSS-style selectors
    Method: Elements select(String cssQuery)
    Syntax: see the syntax defined in the Selector class
  2. XPath: the XML Path Language, a language for addressing parts of an XML document (XML being a subset of the Standard Generalized Markup Language)
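
The selector approach from item 1 can be sketched as follows. The student data and the class name SelectorDemo are made up for the example, and the document is parsed from a string with jsoup's XML parser rather than from student.xml, so that the snippet stands alone (it still needs the jsoup jar on the classpath).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class SelectorDemo {

    // Hypothetical data standing in for student.xml.
    static final String XML =
            "<students>" +
            "<student number='s001'><name id='zjq'>zjq</name><age>18</age></student>" +
            "<student number='s002'><name>tom</name><age>23</age></student>" +
            "</students>";

    static Document load() {
        // Parser.xmlParser() keeps the input as XML instead of normalizing it into an HTML page.
        return Jsoup.parse(XML, "", Parser.xmlParser());
    }

    public static void main(String[] args) {
        Document doc = load();

        Elements names = doc.select("name");                      // by tag name
        Elements byId = doc.select("#zjq");                       // by id attribute value
        Elements ages = doc.select("student[number=s001] > age"); // age children of one student

        System.out.println(names.size());
        System.out.println(byId.text());
        System.out.println(ages.text());
    }
}
```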

Using XPath with Jsoup requires importing an extra jar (JsoupXpath, see the dependency listed earlier).
Consult the w3school reference manual and use XPath syntax to write the queries.
Code:

//1. Get the path of student.xml
String path = JsoupDemo6.class.getClassLoader().getResource("student.xml").getPath();
//2. Get the Document object
Document document = Jsoup.parse(new File(path), "utf-8");
//3. Create a JXDocument from the Document
//   Note: JsoupXpath 2.x is believed to expose a factory method, JXDocument.create(document),
//   in place of this constructor; check the API of the version you import.
JXDocument jxDocument = new JXDocument(document);
//4. Query with XPath syntax
//4.1 Query all student tags
List<JXNode> jxNodes = jxDocument.selN("//student");
for (JXNode jxNode : jxNodes) {
	System.out.println(jxNode);
}
System.out.println("--------------------");
//4.2 Query all name tags under student tags
List<JXNode> jxNodes2 = jxDocument.selN("//student/name");
for (JXNode jxNode : jxNodes2) {
	System.out.println(jxNode);
}
System.out.println("--------------------");
//4.3 Query name tags under student tags that have an id attribute
List<JXNode> jxNodes3 = jxDocument.selN("//student/name[@id]");
for (JXNode jxNode : jxNodes3) {
	System.out.println(jxNode);
}
System.out.println("--------------------");
//4.4 Query name tags under student tags that have an id attribute whose value is zjq
List<JXNode> jxNodes4 = jxDocument.selN("//student/name[@id='zjq']");
for (JXNode jxNode : jxNodes4) {
	System.out.println(jxNode);
}
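
For comparison, the same four XPath expressions can be tried without any extra jar, using the javax.xml.xpath API built into the JDK. This is only a sketch: the student data and the class name XPathDemo are invented, and it uses the JDK's DOM parser rather than Jsoup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathDemo {

    // Hypothetical data standing in for student.xml.
    static final String XML =
            "<students>" +
            "<student number='s001'><name id='zjq'>zjq</name><age>18</age></student>" +
            "<student number='s002'><name>tom</name><age>23</age></student>" +
            "</students>";

    /** Returns how many nodes the XPath expression selects in the sample document. */
    static int count(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("//student"));                 // all student tags
        System.out.println(count("//student/name"));            // name tags under student
        System.out.println(count("//student/name[@id]"));       // name tags that have an id attribute
        System.out.println(count("//student/name[@id='zjq']")); // name tags whose id is zjq
    }
}
```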

That is the end of this article.
If you found it useful, feel free to like, bookmark, and follow; your encouragement is my biggest motivation.
If you spot any mistakes, please point them out.

Keep loving, and head for the next mountains and seas.

Copyright notice
This article was written by [Hua Weiyun]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206251452018118.html