Install MariaDB ColumnStore (version 10.3)
2022-07-24 11:34:00 【Yabingshi】
1 ColumnStore architecture
MariaDB ColumnStore is a columnar storage engine that uses a massively parallel distributed data architecture, aimed at workloads such as big data analytics. It is a columnar storage system built by porting InfiniDB 4.6.7 to MariaDB.
Starting with MariaDB 10.5.4, it is available as a storage engine of MariaDB Server. Before that, it was only available as a separate download.
It is designed for big data scale: it can process petabytes of data, scales linearly, and responds to analytical queries in real time. It takes advantage of the I/O benefits of columnar storage, compression, just-in-time projection, and horizontal and vertical partitioning to deliver high performance when analyzing large data sets.
A MariaDB ColumnStore deployment consists of multiple MariaDB servers that run as modules and work together to provide linear scalability and real-time responses to analytical queries. These modules are the User, Performance and Storage modules.

/*
Columnar storage organizes data by column: the values of each column are stored together.
Comparison between row storage and column storage (figure not included):

*/
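The difference between the two layouts can be sketched in a few lines of Python. This is only a toy model, not how ColumnStore lays out its files; the sample rows and the helper names (`to_columns`, `count_by_row_store`, `count_by_column_store`) are made up for illustration.

```python
# Toy comparison of row-oriented and column-oriented layouts (illustrative only).
rows = [
    {"id": 1, "platform_code": "A", "school_name": "North"},
    {"id": 2, "platform_code": "B", "school_name": "South"},
    {"id": 3, "platform_code": "A", "school_name": "East"},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def count_by_row_store(rows, key):
    """GROUP BY key, COUNT(*) on the row layout: every full row is visited."""
    counts = {}
    for row in rows:
        counts[row[key]] = counts.get(row[key], 0) + 1
    return counts


def count_by_column_store(columns, key):
    """Same aggregation on the column layout: only one column is read."""
    counts = {}
    for value in columns[key]:
        counts[value] = counts.get(value, 0) + 1
    return counts


print(columns["platform_code"])
print(count_by_column_store(columns, "platform_code"))
```

Both functions return the same counts; the column version simply never touches `id` or `school_name`, which is where the I/O saving of a columnar engine comes from.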
User module
The User Module is a MariaDB Server instance configured to run as the front end of the ColumnStore system.
The server runs a number of additional processes to handle concurrency scaling. When a client queries the server, the storage engine passes the query to one of these processes, which breaks the SQL request down and distributes the parts to one or more Performance Modules that process the query and read from storage. The User Module then collects the query results and assembles them into the result set that is returned to the client.
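The fan-out and collect pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not ColumnStore code: the partitions, the thread pool standing in for separate servers, and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data partitions, one per Performance Module.
PARTITIONS = [
    ["A", "B", "A"],
    ["B", "B"],
    ["A", "C"],
]


def performance_module_count(partition):
    """Partial aggregation done next to the data: count values in one partition."""
    counts = {}
    for code in partition:
        counts[code] = counts.get(code, 0) + 1
    return counts


def user_module_group_count(partitions):
    """Fan the work out, then merge the partial counts into one result set."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(performance_module_count, partitions))
    merged = {}
    for partial in partials:
        for code, n in partial.items():
            merged[code] = merged.get(code, 0) + n
    return merged


result = user_module_group_count(PARTITIONS)
print(result)
```

Each worker returns only a small dictionary of partial counts, so the front end merges a handful of values instead of receiving every row.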
Performance module
The Performance Module is responsible for storing, retrieving and managing data, for processing block requests for query operations, and for passing the results back to the User Module or Modules to complete the query request.
The module reads data from disk and caches it in a shared-nothing buffer that is part of the server it runs on. You can configure as many Performance Modules as you need. Each additional module increases the cache size of the overall database and the processing power available to you.
Processes
Managing and monitoring processes
The Process Manager, or ProcMgr, is responsible for starting, monitoring and restarting all MariaDB ColumnStore processes on the Performance Modules.
To do this, ProcMgr uses the Process Monitor, or ProcMon, on each Performance Module to keep track of the MariaDB ColumnStore processes.
Processing queries
The Primitives Processor, or PrimProc, handles query execution. The User Module turns queries from the application into instructions that are sent to the Performance Modules. PrimProc executes these instructions as block-oriented I/O operations, performing predicate filtering, join processing and the initial aggregation of the data, and then sends the data back to the User Module.
Performing loads and writes
The Performance Module handles loads and writes to the underlying persistent storage. It uses two processes for this: WriteEngineServer and cpimport.
WriteEngineServer coordinates DML, DDL and imports on each Performance Module. DDL changes are persisted in the System Catalog, which keeps track of all ColumnStore metadata.
Both the User Module and the Performance Module use cpimport. On the Performance Module, it updates the database files when bulk data is loaded. This allows ColumnStore to support fully parallel loads.
Shared-nothing data cache
The Performance Module uses a shared-nothing data cache. When it first accesses data, it operates on the data as instructed by the User Module and caches it in an LRU-based buffer for subsequent access.
When the Performance Module runs on a dedicated server, you can dedicate most of the available memory to this data cache. Because the Performance Module cache is a shared-nothing design:
- There is no data block pinging between participating Performance Module nodes (as sometimes occurs in other multi-instance or shared-disk database systems).
- The more Performance Module nodes are added to the system, the larger the overall cache size of the database.
Failover
When MariaDB ColumnStore is deployed with multiple Performance Module nodes, a heartbeat mechanism checks that all nodes are online and provides transparent failover when a particular node fails, with no manual intervention required.
When the failed Performance Module comes back online, ColumnStore automatically adopts it back into the configuration and starts using it for work again.
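The general idea of heartbeat-based failure detection can be sketched as below. This is a generic illustration, not ColumnStore's actual mechanism; the node names, the timeout value and the `HeartbeatMonitor` class are hypothetical.

```python
class HeartbeatMonitor:
    """Toy failure detector: a node is active if it reported recently enough."""

    def __init__(self, nodes, timeout):
        self.timeout = timeout
        self.last_seen = {node: 0.0 for node in nodes}

    def heartbeat(self, node, now):
        """Record that a node reported in at time `now` (seconds)."""
        self.last_seen[node] = now

    def active_nodes(self, now):
        """Return the nodes whose last heartbeat is within the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )


monitor = HeartbeatMonitor(["pm1", "pm2", "pm3"], timeout=5.0)
monitor.heartbeat("pm1", now=10.0)
monitor.heartbeat("pm2", now=10.0)
monitor.heartbeat("pm3", now=3.0)  # pm3 stops reporting after t=3

during_failure = monitor.active_nodes(now=10.0)

monitor.heartbeat("pm3", now=12.0)  # pm3 comes back online
after_recovery = monitor.active_nodes(now=12.0)

print(during_failure, after_recovery)
```

Work is only routed to the nodes returned by `active_nodes`, so a silent node drops out of the rotation and rejoins as soon as its heartbeats resume.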
Storage module
You can store data on local storage (that is, on the Performance Modules) or on shared storage (for example, a SAN). In an Amazon EC2 environment, you can use ephemeral or Elastic Block Store (EBS) volumes. When data redundancy is required in a shared-nothing deployment, ColumnStore can be integrated with GlusterFS.
Storage architecture
When you create a table in MariaDB ColumnStore, the system creates at least one file for each column in the table.
The data you write to ColumnStore tables is stored on the Performance Modules in DB Roots, which are located under /usr/local/mariadb/columnstore/datax, where x is the number of the DB Root.

ColumnStore system databases
Database | Description |
calpontsys | Maintains metadata about ColumnStore tables |
infinidb_querystats | Information about query performance |
infinidb_vtable | Database used to create temporary tables during query execution. This database exists only in ColumnStore 1.2 and earlier. In those versions, the user running ColumnStore queries must have the CREATE TEMPORARY TABLES privilege on this database. |
columnstore_info | Database used to retrieve information about ColumnStore usage. |
2 Install the 10.3 version of MariaDB ColumnStore
Because we use MariaDB 10.3.18, we install the matching version of ColumnStore here.
2.1 Install dependency packages
yum -y install epel-release jemalloc boost
2.2 Install ColumnStore
2.2.1 Download
Download address:
2.2.2 Install
tar -xvf mariadb-columnstore-1.2.5-1-centos7.x86_64.bin.tar.gz -C /usr/local/
/usr/local/mariadb/columnstore/bin/post-install

2.2.3 Configure
/usr/local/mariadb/columnstore/bin/postConfigure

For the other prompts, I pressed Enter to accept the defaults.
The final output is as follows, which indicates a successful installation:

source /usr/local/mariadb/columnstore/bin/columnstoreAlias
# You can see that ColumnStore occupies quite a few ports:

2.3 Connect to ColumnStore
2.3.1 mcsadmin

2.3.2 mcsmysql
create database mcs;
use mcs;
create table idbtest(col1 int, col2 int) engine=columnstore;
show create table idbtest;
insert into idbtest values (1, 2);
insert into idbtest values (3, 4);
select * from idbtest;

3 Configure password authentication
Because the default password is empty, anyone can log in without a password, so you need to set one.
update mysql.user set password=password('123456') where user='root';
# Delete anonymous users
delete from mysql.user where user='';
flush privileges;
# Create an administrator account that allows remote connections
grant all privileges on *.* to 'root'@'%' identified by '123456';
After setting up the account, you can connect to ColumnStore with a database client tool (the connection works the same way as an ordinary MySQL connection).
# Create a normal account
(Omitted)
# Grant privileges to the normal account
ColumnStore uses a dedicated database named infinidb_vtable to create all the temporary tables it needs for query processing. By default, the root user account has already been granted privileges on this database, but every other user account must be granted privileges on it:
GRANT CREATE TEMPORARY TABLES ON infinidb_vtable.* TO user ;
4 Verify ColumnStore performance
Compare its performance with MariaDB.
4.1 Import the data from MariaDB into ColumnStore
See: Import the data from MariaDB into ColumnStore (Yabingshi's blog, CSDN)
4.2 Run statistical SQL
SELECT COUNT(*) FROM statistic_login_day_teacher
In ColumnStore this takes only 0.97 seconds.
In MariaDB it takes two and a half minutes.
SELECT platform_code,COUNT(*) FROM statistic_login_day_teacher
GROUP BY platform_code
In ColumnStore this takes just 5 seconds. In MariaDB it takes two and a half minutes.
If you query only some of the columns, ColumnStore is also faster:
SELECT id,area_id,platform_code,school_name FROM statistic_login_day_teacher WHERE DATE_FORMAT(DATE,'%Y%m%d')='20220202'
ColumnStore: 2 minutes 19 seconds.
MariaDB: 4 minutes 42 seconds.
Run the same SQL query in MariaDB and in ColumnStore, this time selecting all columns:
SELECT * FROM statistic_login_day_teacher WHERE DATE_FORMAT(DATE,'%Y%m%d')='20220202'
MariaDB took 7 minutes and ColumnStore took 10 minutes, so ColumnStore was slower here.
4.3 Summary
If you often run statistical queries, such as aggregations or queries that read only some of the columns, ColumnStore is faster.
If you need to read whole rows of data, MariaDB is faster.
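A rough way to see why the results split this way is to estimate how many bytes a full scan has to read in each layout. The sketch below is a back-of-the-envelope model only: the column widths and row count are hypothetical, not measured from statistic_login_day_teacher, and it ignores indexes, compression and the cost of reassembling rows.

```python
# Hypothetical column widths in bytes; not measured from the real table.
COLUMN_WIDTHS = {
    "id": 8,
    "area_id": 8,
    "platform_code": 16,
    "school_name": 64,
    "date": 8,
}


def bytes_scanned_row_store(num_rows, selected):
    """A row store reads whole rows, whichever columns are selected."""
    return num_rows * sum(COLUMN_WIDTHS.values())


def bytes_scanned_column_store(num_rows, selected):
    """A column store reads only the files of the selected columns."""
    return num_rows * sum(COLUMN_WIDTHS[name] for name in selected)


num_rows = 1_000_000
one_col = ["platform_code"]
all_cols = list(COLUMN_WIDTHS)

print(bytes_scanned_row_store(num_rows, one_col))
print(bytes_scanned_column_store(num_rows, one_col))
print(bytes_scanned_column_store(num_rows, all_cols))
```

With one column selected, the columnar scan reads a fraction of the bytes. With all columns selected, both layouts read the same amount, so the columnar advantage disappears, and the column store additionally has to stitch the columns back into rows.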
References
MariaDB ColumnStore - MariaDB Knowledge Base
About MariaDB ColumnStore - MariaDB Knowledge Base
Preparing for ColumnStore Installation - 1.2.5 - MariaDB Knowledge Base