XML for data storage and transmission: usage and parsing

2022-06-25 16:05:00 【Hua Weiyun】

Concept: XML stands for Extensible Markup Language.

Extensible: all tags are user-defined.

Functions
Storing data
- As a configuration file
- For transmission over a network
Differences between XML and HTML
1. XML tags are all user-defined; HTML tags are predefined.
2. XML syntax is strict; HTML syntax is lenient.
3. XML is for storing data; HTML is for displaying data.

Syntax:

Basic syntax:

XML documents use the .xml file suffix
The document declaration, if present, must be at the very start of the document (it is optional in XML 1.0 but recommended)
An XML document has one and only one root tag
Attribute values must be enclosed in quotation marks (single or double)
Tags must be closed correctly
XML tag names are case sensitive
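
The rules above can be checked mechanically. The following is a minimal sketch, assuming the JDK's built-in JAXP DOM parser; the class name WellFormedCheck and the method isWellFormed are made up for illustration. It feeds one correct document and three documents that each break one rule to the parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML, false otherwise. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler, which prints fatal errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax rule was violated
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<users><user id='1'/></users>")); // true: one root, quoted attribute
        System.out.println(isWellFormed("<users><user id=1/></users>"));   // false: unquoted attribute value
        System.out.println(isWellFormed("<users><User></user></users>"));  // false: tag names are case sensitive
        System.out.println(isWellFormed("<users/><users/>"));              // false: two root tags
    }
}
```

A check like this only covers syntax (well-formedness); it says nothing about whether the document follows a DTD or Schema constraint.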

Quick start ：

	<?xml version='1.0' ?>
	<users>
		<user id='1'>
			<name> Have a drink together </name>
			<age>23</age>
			<gender>superman</gender>
			<br/>
		</user>

		<user id='2'>
			<name>zjq</name>
			<age>18</age>
			<gender>man</gender>
		</user>
	</users>

Components:

Document declaration

Format: <?xml attribute-list ?>
Attribute list:
version: the version number; required
encoding: the encoding. Tells the parser which character set the document uses. Default: UTF-8
standalone: whether the document is standalone
Values:
yes: does not depend on other files
no: depends on other files
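
To see how a parser reports these three attributes, here is a small sketch that assumes the JDK's built-in DOM parser; the class name DeclarationInfo and the method describe are illustrative only. It reads the declaration back through the standard DOM methods getXmlVersion, getXmlEncoding, and getXmlStandalone.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclarationInfo {

    /** Parses the XML text (as UTF-8 bytes) and describes its document declaration. */
    public static String describe(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return "version=" + doc.getXmlVersion()
                    + ", encoding=" + doc.getXmlEncoding()
                    + ", standalone=" + doc.getXmlStandalone();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version='1.0' encoding='UTF-8' standalone='yes'?><users/>";
        System.out.println(describe(xml));
    }
}
```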

Processing instructions: used, for example, to attach a CSS stylesheet

<?xml-stylesheet type="text/css" href="a.css" ?>

Tags: tag names are user-defined

Rules:
Names can contain letters, digits, and other characters
Names cannot start with a digit or a punctuation character
Names cannot start with the letters xml (or XML, Xml, and so on)
Names cannot contain spaces

Attributes

The value of an id attribute is unique within the document

Text

CDATA section: data inside it is treated as plain text and kept exactly as written
Format: <![CDATA[ data ]]>
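
As a small illustration of what "kept exactly as written" means, the sketch below (JDK DOM parser assumed; CdataDemo and rootText are made-up names) parses the same text once with escaped characters and once inside a CDATA section. Both forms should yield the same string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataDemo {

    /** Returns the text content of the root element of the given XML. */
    public static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Outside CDATA, '<' and '&' must be written as &lt; and &amp;.
        String escaped = "<code>if (a &lt; b &amp;&amp; b &lt; c) {}</code>";
        // Inside CDATA, the same characters can be written literally.
        String cdata = "<code><![CDATA[if (a < b && b < c) {}]]></code>";
        System.out.println(rootText(escaped)); // if (a < b && b < c) {}
        System.out.println(rootText(cdata));   // if (a < b && b < c) {}
    }
}
```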

Constraints: rules that specify how an XML document must be written

As users of a framework (programmers), we need to:

Be able to introduce a constraint document into an XML file
Be able to read simple constraint documents

Types:

DTD: a simple constraint technology
Schema: a more complex constraint technology

DTD

Introducing a DTD into an XML document

Internal DTD: the constraint rules are defined inside the XML document
External DTD: the constraint rules are defined in an external .dtd file
Local: <!DOCTYPE root-tag-name SYSTEM "location of the dtd file">
Network: <!DOCTYPE root-tag-name PUBLIC "name of the dtd file" "URL of the dtd file">
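
A rough sketch of the internal DTD case, assuming the JDK's DOM parser with validation switched on. The DTD content, the element names, and the class name DtdValidation are invented for this example and are not taken from the original student.dtd.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class DtdValidation {

    // Internal DTD: the constraint rules live inside the XML document itself.
    static final String DTD =
            "<!DOCTYPE students [\n"
          + "  <!ELEMENT students (student*)>\n"
          + "  <!ELEMENT student (name, age)>\n"
          + "  <!ATTLIST student number ID #REQUIRED>\n"
          + "  <!ELEMENT name (#PCDATA)>\n"
          + "  <!ELEMENT age (#PCDATA)>\n"
          + "]>\n";

    /** Returns true if DTD + body is valid against the DTD above. */
    public static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validation errors are only reported to the handler, so rethrow them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ok = "<students><student number='s001'><name>zjq</name><age>18</age></student></students>";
        String missingAge = "<students><student number='s001'><name>zjq</name></student></students>";
        System.out.println(isValid(ok));         // true
        System.out.println(isValid(missingAge)); // false: <age> is required by the DTD
    }
}
```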

Schema

Introducing a schema:

Write the root element of the XML document
Introduce the xsi prefix: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
Introduce the xsd file's namespace and location: xsi:schemaLocation="http://www.zjq.com/xml student.xsd"
Declare a namespace for each xsd constraint as its identifier; xmlns="http://www.zjq.com/xml" makes it the default namespace (no prefix needed)

Example:
<students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.zjq.com/xml"
xsi:schemaLocation="http://www.zjq.com/xml student.xsd">
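
For a feel of how a schema constrains a document, here is a sketch using the JDK's javax.xml.validation API. The schema text is a guess at what a simple student.xsd might contain (it is not the original file), and it is passed to the validator in code rather than located through xsi:schemaLocation. The class name SchemaValidation is illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class SchemaValidation {

    static final String NS = "http://www.zjq.com/xml";

    // Hypothetical schema: <students> holds one or more <student>, each with name and age.
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
          + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
          + "<xsd:element name='students'><xsd:complexType><xsd:sequence>"
          + "<xsd:element name='student' maxOccurs='unbounded'><xsd:complexType><xsd:sequence>"
          + "<xsd:element name='name' type='xsd:string'/>"
          + "<xsd:element name='age' type='xsd:int'/>"
          + "</xsd:sequence></xsd:complexType></xsd:element>"
          + "</xsd:sequence></xsd:complexType></xsd:element>"
          + "</xsd:schema>";

    /** Returns true if the XML text is valid against the schema above. */
    public static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // validate() throws SAXException on the first validation error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ok = "<students xmlns='" + NS + "'><student><name>zjq</name><age>18</age></student></students>";
        String badAge = "<students xmlns='" + NS + "'><student><name>zjq</name><age>abc</age></student></students>";
        System.out.println(isValid(ok));     // true
        System.out.println(isValid(badAge)); // false: age must be an integer
    }
}
```

Unlike a DTD, the schema can constrain the data type of the text (xsd:int here), which is one reason it is considered the more capable technology.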

Parsing: operating on an XML document, reading the data in the document into memory

Operations on an XML document

Parsing (reading): read the data in the document into memory
Writing: save the data in memory to an XML document for persistent storage
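
The writing direction is not shown elsewhere in this article, so here is a minimal sketch of it, assuming the JDK's DOM and Transformer APIs. WriteXml and buildUsersXml are made-up names; the element names follow the quick-start document.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class WriteXml {

    /** Builds <users><user id="..."><name>...</name></user></users> in memory and serializes it. */
    public static String buildUsersXml(String id, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element users = doc.createElement("users");
            doc.appendChild(users);

            Element user = doc.createElement("user");
            user.setAttribute("id", id);
            users.appendChild(user);

            Element nameElement = doc.createElement("name");
            nameElement.setTextContent(name);
            user.appendChild(nameElement);

            // Serialize the in-memory tree; a StreamResult over a File would persist it to disk.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUsersXml("1", "zjq"));
    }
}
```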

Ways to parse XML

DOM: loads the whole markup document into memory at once and builds a DOM tree in memory
Advantage: easy to work with; supports all CRUD operations on the document
Disadvantage: uses a lot of memory
SAX: reads the document sequentially from start to end and is event driven
Advantage: uses very little memory
Disadvantage: read-only; cannot add, delete, or modify
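
The two approaches can be put side by side. This sketch assumes the JDK's JAXP implementation; the class name DomVsSax and the sample data are invented. It extracts the same name values once from a DOM tree and once from SAX events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSax {

    // Sample data made up for this example.
    static final String XML =
            "<users><user id='1'><name>zjq</name></user><user id='2'><name>tom</name></user></users>";

    /** DOM: the whole tree is in memory, so it can be queried (and modified) freely. */
    public static List<String> namesWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("name");
            List<String> names = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX: the parser pushes events to a handler; nothing is kept unless the handler keeps it. */
    public static List<String> namesWithSax(String xml) {
        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <name> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesWithDom(XML)); // [zjq, tom]
        System.out.println(namesWithSax(XML)); // [zjq, tom]
    }
}
```

The SAX version has to track its own position in the document (the text field), which is the price paid for not holding the tree in memory.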

Common XML parsers

JAXP: the parser provided by Sun; supports both the DOM and SAX approaches
DOM4J: a very good parser
Jsoup: jsoup is a Java HTML parser that can directly parse a URL or HTML text. It provides a very convenient API for extracting and manipulating data through DOM methods, CSS selectors, and jQuery-like operations.
PULL: the parser built into the Android operating system; SAX-style.

Jsoup

Quick start

Steps:

Import the jar package
Obtain the Document object
Get the Element object for the desired tag
Get the data

Maven coordinates:

	<!--jsoup-->
	<dependency>
		<groupId>org.jsoup</groupId>
		<artifactId>jsoup</artifactId>
		<version>1.14.3</version>
	</dependency>
	<!--JsoupXpath-->
	<dependency>
		<groupId>cn.wanghaomiao</groupId>
		<artifactId>JsoupXpath</artifactId>
		<version>2.5.1</version>
	</dependency>

Code

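	import java.io.File;
	import java.io.IOException;
	import org.jsoup.Jsoup;
	import org.jsoup.nodes.Document;
	import org.jsoup.nodes.Element;
	import org.jsoup.select.Elements;

	public class JsoupDemo1 {
		public static void main(String[] args) throws IOException {
			//2.1 Get the path of student.xml on the classpath
			String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
			//2.2 Parse the xml file and load the document into memory to get the DOM tree ---> Document
			Document document = Jsoup.parse(new File(path), "utf-8");
			//3. Get the element objects (Elements)
			Elements elements = document.getElementsByTag("name");
			System.out.println(elements.size());
			//3.1 Get the first name Element object
			Element element = elements.get(0);
			//3.2 Get the data
			String name = element.text();
			System.out.println(name);
		}
	}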

Using the objects:

Jsoup: utility class that parses an html or xml document and returns a Document

parse: parses an html or xml document and returns a Document

parse(File in, String charsetName): parses an xml or html file
parse(String html): parses an xml or html string
parse(URL url, int timeoutMillis): fetches the html or xml document at the given network path

Document: the document object; represents the DOM tree in memory

Getting Element objects

getElementById(String id): gets the unique element object by its id attribute value
getElementsByTag(String tagName): gets the collection of element objects by tag name
getElementsByAttribute(String key): gets the collection of element objects by attribute name
getElementsByAttributeValue(String key, String value): gets the collection of element objects by attribute name and attribute value

Elements: a collection of Element objects; can be used like an ArrayList<Element>

Element: the element object

Getting child element objects: the getElementsBy... methods listed under Document are also available on Element and search within that element

Getting attribute values

String attr(String key): gets the attribute value by attribute name

Getting text content

String text(): gets the text content
String html(): gets everything inside the tag body (including the markup of child tags)
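
The methods listed above can be tried on a small in-memory document. This is only a sketch: it needs the jsoup jar on the classpath, the XML content is a made-up stand-in for student.xml, the class name JsoupElementDemo is illustrative, and it uses jsoup's XML parser (Parser.xmlParser()) rather than the default HTML parser used in the quick start.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class JsoupElementDemo {

    // Hypothetical stand-in for student.xml.
    static final String XML =
            "<students>"
          + "<student number='s001'><name id='zjq'>zhang <b>san</b></name><age>18</age></student>"
          + "<student number='s002'><name>lisi</name><age>20</age></student>"
          + "</students>";

    /** Parses the sample string with jsoup's XML parser. */
    public static Document load() {
        return Jsoup.parse(XML, "", Parser.xmlParser());
    }

    public static void main(String[] args) {
        Document doc = load();

        // Document-level lookups
        Elements withId = doc.getElementsByAttribute("id");
        Elements s002 = doc.getElementsByAttributeValue("number", "s002");
        Element name = doc.getElementById("zjq");

        System.out.println(withId.size());                            // elements that carry an id attribute
        System.out.println(s002.get(0).attr("number"));               // attribute value
        System.out.println(s002.get(0).getElementsByTag("name").text()); // child lookup on an Element
        System.out.println(name.text());                              // text only
        System.out.println(name.html());                              // text plus child markup
    }
}
```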

Node: the node object

Node is the parent class of Element (and therefore of Document, which extends Element)
Shortcut queries:

selector: selectors
Method used: Elements select(String cssQuery)
Syntax: see the syntax defined in the Selector class
XPath: XPath is the XML Path Language, a language for locating parts of an XML document (XML is a subset of the Standard Generalized Markup Language)

Using XPath with Jsoup requires importing an extra jar package.
Consult the w3cschool reference manual and use XPath syntax to write the queries
Code:

	//1. Get the path of student.xml on the classpath
	String path = JsoupDemo6.class.getClassLoader().getResource("student.xml").getPath();
	//2. Obtain the Document object
	Document document = Jsoup.parse(new File(path), "utf-8");
	//3. Create a JXDocument object from the Document object
	//   (with JsoupXpath 2.x this may need to be JXDocument.create(document); check the version you import)
	JXDocument jxDocument = new JXDocument(document);

	//4. Query with XPath syntax
	//4.1 Query all student tags
	List<JXNode> jxNodes = jxDocument.selN("//student");
	for (JXNode jxNode : jxNodes) {
		System.out.println(jxNode);
	}
	System.out.println("--------------------");

	//4.2 Query the name tags under all student tags
	List<JXNode> jxNodes2 = jxDocument.selN("//student/name");
	for (JXNode jxNode : jxNodes2) {
		System.out.println(jxNode);
	}
	System.out.println("--------------------");

	//4.3 Query the name tags under student tags that have an id attribute
	List<JXNode> jxNodes3 = jxDocument.selN("//student/name[@id]");
	for (JXNode jxNode : jxNodes3) {
		System.out.println(jxNode);
	}
	System.out.println("--------------------");

	//4.4 Query the name tags under student tags that have an id attribute whose value is zjq
	List<JXNode> jxNodes4 = jxDocument.selN("//student/name[@id='zjq']");
	for (JXNode jxNode : jxNodes4) {
		System.out.println(jxNode);
	}
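
The selector route, Elements select(String cssQuery), has no code of its own in this article, so here is a rough sketch of it. It needs the jsoup jar on the classpath; the XML content is a made-up stand-in for student.xml, the class name JsoupSelectDemo is illustrative, and the queries mirror the XPath ones in spirit only.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class JsoupSelectDemo {

    // Hypothetical stand-in for student.xml.
    static final String XML =
            "<students>"
          + "<student number='s001'><name id='zjq'>zhangsan</name><age>18</age></student>"
          + "<student number='s002'><name>lisi</name><age>20</age></student>"
          + "</students>";

    /** Runs a CSS query against the sample document, parsed with jsoup's XML parser. */
    public static Elements query(String cssQuery) {
        Document doc = Jsoup.parse(XML, "", Parser.xmlParser());
        return doc.select(cssQuery);
    }

    public static void main(String[] args) {
        System.out.println(query("name").size());                        // tag selector: all name tags
        System.out.println(query("#zjq").text());                        // id selector: element with id zjq
        System.out.println(query("student[number=s001] > age").text());  // attribute value plus child combinator
    }
}
```

CSS selectors cover most everyday lookups; XPath is mainly useful when the query needs to be expressed as a path through the document.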

That is the end of this article.
If you found it useful, you are welcome to like, bookmark, and follow. Your encouragement is my biggest motivation.
If you spot any mistakes, please point them out.