Installation and deployment of alluxio
2022-06-26 05:18:00 【Air transport Alliance】
Development of guidelines - Alluxio v2.6.2 (stable) Documentation
First, decide on the deployment environment: local, cluster, AWS, and so on. This also means choosing the under storage for Alluxio: the local file system, HDFS, S3, etc.
Tip: Released versions can be downloaded from the Alluxio download page. Every Alluxio release provides precompiled binaries compatible with different Hadoop versions. The "Building Alluxio from the Master Branch" page explains how to compile Alluxio from source.
1. Basic requirements
The following are the basic requirements for running Alluxio in local or cluster mode:
- Cluster nodes must run one of the following operating systems:
  - macOS 10.10 or later
  - CentOS 6.8 or 7
  - RHEL 7.x
  - Ubuntu 16.04
- Alluxio requires JDK 8. Later versions are not supported:
  - Java JDK 8 (Oracle or OpenJDK distributions are supported)
- Alluxio supports only the IPv4 network protocol
- Open the following ports and protocols:
  - Inbound TCP 22: lets the user SSH into the nodes to install Alluxio components.
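Since Alluxio requires JDK 8, it can help to check the Java version on each node before installing. The following is a minimal sketch, not part of Alluxio: the helper names java_major, installed_java_version, and check_java8 are hypothetical, and the sketch assumes that `java -version` prints a quoted version string such as "1.8.0_301".

```shell
# Hypothetical helper: extract the Java major version from a version string
# such as "1.8.0_301" (Java 8) or "11.0.2" (Java 11).
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;   # legacy scheme: 1.8.0_301 -> 8
    *)   echo "$1" | cut -d. -f1 ;;   # newer scheme: 11.0.2 -> 11
  esac
}

# Read the quoted version from `java -version`, which prints to stderr.
installed_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Succeeds only when the installed JDK reports major version 8.
check_java8() {
  [ "$(java_major "$(installed_java_version)")" = "8" ]
}
```

For example, `check_java8 || echo "JDK 8 not found"` could be run on each node before unpacking Alluxio.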
Master requirements
The following is the configuration required for cluster nodes running the Alluxio master process.
Note that these are minimum requirements. Running Alluxio at large scale and under high load increases the system requirements accordingly.
- At least 4 GB of disk space
- At least 4 GB of memory
- At least 4 CPU cores
- Open the following ports and protocols:
  - Inbound TCP 19998: Alluxio master default RPC port
  - Inbound TCP 19999: Alluxio master default web UI port: http://<master-hostname>:19999
  - Inbound TCP 20001: Alluxio job master default RPC port
  - Inbound TCP 20002: Alluxio job master default web UI port
- Embedded journal requirements:
  - Inbound TCP 19200: Alluxio master default port for internal leader election
  - Inbound TCP 20003: Alluxio job master default port for internal leader election
Worker requirements
The following is the configuration required for cluster nodes running the Alluxio worker process.
- At least 1 GB of disk space
- At least 1 GB of memory
- At least 2 CPU cores
- Open the following ports and protocols:
  - Inbound TCP 29999: Alluxio worker default RPC port
  - Inbound TCP 30000: Alluxio worker default web UI port: http://<worker-hostname>:30000
  - Inbound TCP 30001: Alluxio job worker default RPC port
  - Inbound TCP 30002: Alluxio job worker default data port
  - Inbound TCP 30003: Alluxio job worker default web UI port: http://<worker-hostname>:30003
Worker cache
Storage space must be configured for the Alluxio workers to use as cache. By default, Alluxio provisions a RAMFS for each worker, but this can be changed to use other storage volumes. By listing other directories in alluxio.worker.tieredstore.level%d.dirs.path, users can make Alluxio use storage media and directories other than the default. To get started with the default configuration, run the command ./bin/alluxio-mount.sh SudoMount workers from any account with sudo privileges. Run it only after setting alluxio.worker.ramdisk.size in the alluxio-site.properties file and adding all workers to the conf/workers file.
$ ./bin/alluxio-mount.sh SudoMount workers
Proxy requirements
The Proxy process provides a REST-based client and requires:
- At least 1 GB of memory
- Open the following ports and protocols:
  - Inbound TCP 39999: used by clients to access the proxy node.
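The default ports listed in the master, worker, and proxy requirements can be probed from another machine once the services are running. The following is a rough sketch under stated assumptions: the function names alluxio_ports, port_open, and check_role_ports are hypothetical, bash and the coreutils timeout command are assumed to be installed, and the port numbers are simply the defaults listed above.

```shell
# Default ports per role, copied from the requirement lists above.
# 19200 and 20003 only matter when the embedded journal is used.
alluxio_ports() {
  case "$1" in
    master) echo "19998 19999 20001 20002 19200 20003" ;;
    worker) echo "29999 30000 30001 30002 30003" ;;
    proxy)  echo "39999" ;;
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Try a TCP connection through bash's /dev/tcp; give up after 2 seconds.
# Inside the bash -c script, $0 is the host and $1 is the port.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Print the state of every default port of a role on a host.
check_role_ports() {
  for p in $(alluxio_ports "$2"); do
    if port_open "$1" "$p"; then echo "$1:$p open"; else echo "$1:$p closed"; fi
  done
}
```

For example, `check_role_ports <master-hostname> master` prints one line per port. A port is also reported as closed when no process is listening on it, so a "closed" result before Alluxio is started does not by itself indicate a firewall problem.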
Fuse requirements
The following are the requirements for nodes running the Alluxio FUSE process.
Note that these are the minimum requirements for running the software. Running Alluxio FUSE under heavy load increases the system requirements.
- At least 1 CPU core
- At least 1 GB of memory
- FUSE installed:
  - libfuse 2.9.3 or later (for Linux)
  - osxfuse 3.7.1 or later (for macOS)
Other requirements
Alluxio can also aggregate logs on a remote server for centralized viewing. The port and resource requirements for the logging server are listed below.
Remote Logging Server requirements
The following are the requirements for running the Alluxio Remote Logging Server:
- At least 1 GB of disk space
- At least 1 GB of memory
- At least 2 CPU cores
- Open the following ports and protocols:
  - Inbound TCP 45600: lets loggers write logs to the server.
2. Local Alluxio installation and configuration
This setup uses the local file system as the under storage.
Download the installation package from the Alluxio download page (Try Alluxio in the cloud or download/install where you want it).
1) Configure Alluxio
$ tar -xzf alluxio-bin.tar.gz
$ cd alluxio-2.6.2
# Create the conf/alluxio-site.properties configuration file from the template.
$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties
# In conf/alluxio-site.properties, set:
alluxio.master.hostname=localhost
# The ramdisk size must not exceed the memory actually available on the system
alluxio.worker.ramdisk.size=1GB
$ cp conf/alluxio-env.sh.template conf/alluxio-env.sh
# In conf/alluxio-env.sh, specify the Java path:
JAVA_HOME=/usr/java/jdk1.8.0_301
2) Mount the RAMFS file system
$ sudo ./bin/alluxio-mount.sh SudoMount
3) Format the Alluxio file system
Note: This step is only needed the first time Alluxio is run. If you run the format command on an already deployed Alluxio cluster, all data and metadata of the Alluxio file system previously stored on the current server are erased. The data in the under storage, however, is not changed.
$ sudo ./bin/alluxio format
$ ./bin/alluxio validateEnv local # Check the runtime environment
4) Start the Alluxio file system locally
Run one of the following commands to start the Alluxio file system.
# If the ramdisk has not been mounted yet, or needs to be remounted (for example, to change its size)
$ sudo ./bin/alluxio-start.sh local SudoMount
# Or, if the ramdisk is already mounted
$ sudo ./bin/alluxio-start.sh local
5) Verify that Alluxio is running
$ jps
5059 AlluxioProxy
5688 Jps
4377 AlluxioJobMaster
4268 AlluxioMaster
5053 AlluxioWorker
5055 AlluxioJobWorker
To confirm that Alluxio is running, visit http://localhost:19999 to see the status of the Alluxio master and http://localhost:30000 to see the status of the Alluxio worker.
Tip: If access by IP address from another machine fails, the firewall may be blocking the ports.
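The jps check can also be scripted. The sketch below is only an illustration: missing_alluxio_processes is a hypothetical helper, not an Alluxio command. It reads jps output on standard input and prints each expected process name that is absent.

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# process name (given as arguments) that does not appear in it.
missing_alluxio_processes() {
  jps_output=$(cat)
  for name in "$@"; do
    # Compare against the second column exactly, so that AlluxioMaster
    # is not confused with AlluxioJobMaster.
    echo "$jps_output" | awk -v n="$name" '$2 == n {found=1} END {exit !found}' \
      || echo "$name"
  done
}
```

For a local deployment, `jps | missing_alluxio_processes AlluxioMaster AlluxioJobMaster AlluxioWorker AlluxioJobWorker AlluxioProxy` prints nothing when all five processes are running.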
Run a more comprehensive system integrity check:
$ ./bin/alluxio runTests
You can stop Alluxio at any time with the following command:
$ ./bin/alluxio-stop.sh local
6) Using the Alluxio shell
The Alluxio shell provides a variety of command-line operations for interacting with Alluxio. To see the list of file system commands, run:
$ ./bin/alluxio fs
You can list files in Alluxio with the ls command. For example, to list all files in the root directory:
$ ./bin/alluxio fs ls /
At this point there are no files in Alluxio. The copyFromLocal command copies a local file into Alluxio.
$ ./bin/alluxio fs copyFromLocal LICENSE /LICENSE
Copied LICENSE to /LICENSE
List the files in Alluxio again and you will see the LICENSE file that was just copied:
$ ./bin/alluxio fs ls /
-rw-r--r-- staff staff 26847 NOT_PERSISTED 01-09-2018 15:24:37:088 100% /LICENSE
The output shows that the LICENSE file is in Alluxio. It also includes other useful information, such as the file size, the creation date, the owner and group of the file, and the percentage of the file cached in Alluxio.
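When checking many files, the cached percentage can be pulled out of the ls output with awk. This is a sketch only: cached_percent is a hypothetical helper, and it assumes the column layout shown above (percentage in the second-to-last column, path in the last) and paths without spaces.

```shell
# Hypothetical helper: read `alluxio fs ls` output on stdin and print
# "<path> <percent>" for every file. Directory lines carry DIR instead of a
# percentage in that column and are skipped.
cached_percent() {
  awk '$(NF-1) ~ /%$/ {print $NF, $(NF-1)}'
}
```

For example, `./bin/alluxio fs ls / | cached_percent` would reduce the listing above to the path and its percentage.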
The cat command prints the contents of a file.
$ ./bin/alluxio fs cat /LICENSE
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
...
By default, Alluxio uses the local file system as the under file system (UFS). The default UFS path is ./underFSStorage. We can look at the contents of the UFS:
$ ls ./underFSStorage/
However, this directory does not exist! This is because, by default, Alluxio writes data only to Alluxio space and not to the UFS.
We can, however, tell Alluxio to persist the file from Alluxio space to the UFS. The shell command persist does this.
$ ./bin/alluxio fs persist /LICENSE
persisted file /LICENSE with size 26847
If we check the UFS again now, the file appears.
$ ls ./underFSStorage
LICENSE
7) [Bonus] Mounting in Alluxio
To let Alluxio interact with Amazon S3, add your AWS access credentials to the Alluxio configuration in conf/alluxio-site.properties. The following commands update the configuration. After changing the configuration, synchronize it to every node and restart Alluxio.
$ echo "aws.accessKeyId=<AWS_ACCESS_KEY_ID>" >> conf/alluxio-site.properties
$ echo "aws.secretKey=<AWS_SECRET_ACCESS_KEY>" >> conf/alluxio-site.properties
You must replace <AWS_ACCESS_KEY_ID> with your AWS access key ID and <AWS_SECRET_ACCESS_KEY> with your AWS secret access key.
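Appending with echo adds a second line for the same key every time the command is repeated. One possible way to avoid duplicates is a small helper that replaces an existing assignment before writing the new one. The function set_prop below is hypothetical, not an Alluxio tool, and assumes plain key=value lines.

```shell
# Hypothetical helper: set key=value in a properties file, replacing any
# existing line for the same key instead of appending a duplicate.
set_prop() {
  file=$1 key=$2 value=$3
  touch "$file"
  tmp=$(mktemp)
  # Keep every line that does not start with "key=" (literal prefix match).
  awk -v k="$key=" 'index($0, k) != 1' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_prop conf/alluxio-site.properties aws.accessKeyId <AWS_ACCESS_KEY_ID>` can be run repeatedly and leaves a single line for that key.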
Alluxio provides unified access to storage systems through its unified namespace. For a more detailed explanation, read the unified namespace blog post and the unified namespace documentation.
This feature lets users mount different storage systems into the Alluxio namespace and access files across them seamlessly through that namespace.
First, create a directory in Alluxio to serve as the mount point.
$ ./bin/alluxio fs mkdir /mnt
Successfully created directory /mnt
Next, mount an existing S3 bucket into Alluxio. This guide uses the alluxio-quick-start S3 bucket.
$ ./bin/alluxio fs mount --readonly alluxio://localhost:19998/mnt/s3 s3://alluxio-quick-start/data
Mounted s3://alluxio-quick-start/data at alluxio://localhost:19998/mnt/s3
If the AWS access keys have not been configured beforehand, you can specify them when mounting:
$ ./bin/alluxio fs mount --option aws.accessKeyId=<accessKeyId> --option aws.secretKey=<secretKey> /mnt/s3 s3://data-bucket/
Example:
$ ./bin/alluxio fs mount --option aws.accessKeyId=******* --option aws.secretKey=*************** /mnt/s3 s3://alluxio-quick-start/data
Mounted s3://alluxio-quick-start/data at /mnt/s3
$ ./bin/alluxio fs ls /mnt/s3
-r-x------ song_jie0109 song_jie0109 10077271 PERSISTED 06-21-2016 02:03:30:000 0% /mnt/s3/sample_tweets_10m.csv
-r-x------ song_jie0109 song_jie0109 955610 PERSISTED 06-21-2016 02:03:22:000 0% /mnt/s3/sample_tweets_1m.csv
-r-x------ song_jie0109 song_jie0109 89964 PERSISTED 06-21-2016 02:03:45:000 0% /mnt/s3/sample_tweets_100k.csv
-r-x------ song_jie0109 song_jie0109 157046046 PERSISTED 06-21-2016 02:03:45:000 0% /mnt/s3/sample_tweets_150m.csv
We can list the files in S3 through the Alluxio namespace. Use the familiar ls command to list the files under the S3 mount directory.
$ ./bin/alluxio fs ls /mnt/s3
-r-x------ staff staff 955610 PERSISTED 01-09-2018 16:35:00:882 0% /mnt/s3/sample_tweets_1m.csv
-r-x------ staff staff 10077271 PERSISTED 01-09-2018 16:35:00:910 0% /mnt/s3/sample_tweets_10m.csv
-r-x------ staff staff 89964 PERSISTED 01-09-2018 16:35:00:972 0% /mnt/s3/sample_tweets_100k.csv
-r-x------ staff staff 157046046 PERSISTED 01-09-2018 16:35:01:002 0% /mnt/s3/sample_tweets_150m.csv
The newly mounted files and directories can also be viewed in the Alluxio web UI.
With Alluxio's unified namespace, you can interact seamlessly with data from different storage systems. For example, the ls -R command recursively lists all the files in a directory.
$ ./bin/alluxio fs ls -R /
-rw-r--r-- staff staff 26847 PERSISTED 01-09-2018 15:24:37:088 100% /LICENSE
drwxr-xr-x staff staff 1 PERSISTED 01-09-2018 16:05:59:547 DIR /mnt
dr-x------ staff staff 4 PERSISTED 01-09-2018 16:34:55:362 DIR /mnt/s3
-r-x------ staff staff 955610 PERSISTED 01-09-2018 16:35:00:882 0% /mnt/s3/sample_tweets_1m.csv
-r-x------ staff staff 10077271 PERSISTED 01-09-2018 16:35:00:910 0% /mnt/s3/sample_tweets_10m.csv
-r-x------ staff staff 89964 PERSISTED 01-09-2018 16:35:00:972 0% /mnt/s3/sample_tweets_100k.csv
-r-x------ staff staff 157046046 PERSISTED 01-09-2018 16:35:01:002 0% /mnt/s3/sample_tweets_150m.csv
The output shows all files from the mounted storage systems under the root of the Alluxio file system. The /LICENSE file is in the local file system, and the /mnt/s3/ directory is in S3.
8) [Bonus] Accelerating data access with Alluxio
Because Alluxio uses memory to store data, it can speed up data access. First, let's look at the status of one of the files mounted from S3 earlier:
$ ./bin/alluxio fs ls /mnt/s3/sample_tweets_150m.csv
-r-x------ staff staff 157046046 PERSISTED 01-09-2018 16:35:01:002 0% /mnt/s3/sample_tweets_150m.csv
The output shows that the file is not in Alluxio memory (0%). This file is a sample of tweets. Let's count how many tweets mention the word "kitten" and time the operation.
$ time ./bin/alluxio fs cat /mnt/s3/sample_tweets_150m.csv | grep -c kitten
889
real 0m22.857s
user 0m7.557s
sys 0m1.181s
Depending on your network connection, this operation may take more than 20 seconds. If reading the file takes too long, you can use a smaller data set: the other files in this directory are smaller subsets of this file. By keeping data in memory, Alluxio can speed up access to it.
After reading the file with the cat command, you can check its status with the ls command:
$ ./bin/alluxio fs ls /mnt/s3/sample_tweets_150m.csv
-r-x------ staff staff 157046046 PERSISTED 01-09-2018 16:35:01:002 100% /mnt/s3/sample_tweets_150m.csv
The output shows that the file is 100% loaded into Alluxio, so accessing it again should be much faster.
Now let's count the number of tweets containing the word "puppy".
$ time ./bin/alluxio fs cat /mnt/s3/sample_tweets_150m.csv | grep -c puppy
1553
real 0m1.917s
user 0m2.306s
sys 0m0.243s
As you can see, because the data is already in Alluxio memory, subsequent reads of the same file are much faster.
Now let's count how many tweets contain the word "bunny".
$ time ./bin/alluxio fs cat /mnt/s3/sample_tweets_150m.csv | grep -c bunny
907
real 0m1.983s
user 0m2.362s
sys 0m0.240s
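To put a number on the difference, the wall-clock ("real") times from the runs above can be compared directly. The sketch below uses a hypothetical helper named speedup; the two values are the real times reported for the first (kitten) and second (puppy) runs.

```shell
# Ratio of two wall-clock times in seconds, rounded to one decimal place.
speedup() {
  awk -v cold="$1" -v warm="$2" 'BEGIN { printf "%.1f\n", cold / warm }'
}

# Cold read from S3 (22.857 s) vs. warm read from Alluxio memory (1.917 s).
speedup 22.857 1.917   # prints 11.9
```

In this run the second read was roughly 12 times faster. The exact ratio depends on the network connection to S3 and on the machine.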
Congratulations! You have installed Alluxio locally and used it to accelerate data access!
Stopping Alluxio
You can stop Alluxio with the following command:
$ ./bin/alluxio-stop.sh local
3. Deploying Alluxio on a cluster
Deploy an Alluxio Cluster with a Single Master - Alluxio v2.6.0 Documentation
3.1 Single-master cluster
This is the simplest way to deploy Alluxio on a cluster, but the single master is a single point of failure.
Prerequisites:
(1) Passwordless SSH access is set up between all nodes
(2) The RPC port is open on all nodes (default: 19998)
(3) The operating user has sudo privileges, which are needed to mount the RAMFS.
1) Download and extract
$ tar -xvf alluxio-2.6.0-bin.tar.gz
2) Modify the configuration file
$ cd alluxio-2.6.0/
$ cd conf/
$ cp alluxio-site.properties.template alluxio-site.properties
$ vim alluxio-site.properties
For example, the modified content is as follows:
alluxio.master.hostname=clu00
alluxio.master.mount.table.root.ufs=hdfs://clu00:9090/alluxio
i) alluxio.master.hostname is the master node of the cluster. You can use an IP address or a domain name, as long as all worker nodes can reach it.
ii) alluxio.master.mount.table.root.ufs specifies the URI of the under storage mounted at the Alluxio root.
For example, when HDFS is used as the under storage system, the value of this property can be set to alluxio.master.mount.table.root.ufs=hdfs://1.2.3.4:9000/alluxio/root/
When Amazon S3 is used as the under storage system, the value can be set to alluxio.master.mount.table.root.ufs=s3://bucket/dir/
Tip: The master configuration property alluxio.master.mount.table.root.ufs mounts the specified directory at the root of the Alluxio namespace (Alluxio's base storage space). This directory serves as Alluxio's "primary storage". On top of it, users can add and remove mounts through the mount API (that is, mount multiple under storage systems).
3) Specify the Java environment
$ cp alluxio-env.sh.template alluxio-env.sh
$ vim alluxio-env.sh
JAVA_HOME=/usr/java/jdk1.8.0_301
4) Set the master and worker nodes
$ vim masters
clu00
$ vim workers
clu01
clu02
These are the minimum settings required to start the cluster. For other settings, see the configuration properties reference.
- You may need to set additional properties to enable Alluxio to access the configured under storage (e.g., AWS S3 configuration)
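Because the start scripts reach the hosts in conf/masters and conf/workers over SSH, it can be worth confirming passwordless access to each host before going further. The following sketch is not part of Alluxio: list_hosts and check_passwordless_ssh are hypothetical helpers, and the sketch assumes one host name per line with optional "#" comment lines.

```shell
# List host names from a conf/masters or conf/workers style file,
# skipping blank lines and "#" comment lines.
list_hosts() {
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Report whether each listed host accepts key-based SSH without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless_ssh() {
  for host in $(list_hosts "$1"); do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "$host ok"
    else
      echo "$host FAILED"
    fi
  done
}
```

For example, `check_passwordless_ssh conf/workers` run on the master prints one line per worker. Any host reported as FAILED needs its SSH key setup fixed before the cluster is started.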
5) Synchronize the configuration to all nodes
# Run from the Alluxio installation directory (/root/alluxio-2.6.0)
$ ./bin/alluxio copyDir /root/alluxio-2.6.0
RSYNC'ing /root/alluxio-2.6.0 to masters...
clu00
RSYNC'ing /root/alluxio-2.6.0 to workers...
clu01
clu02
6) Format
Before the first start, format Alluxio on the master node. This deletes all Alluxio metadata but does not affect the data in the under storage.
$ ./bin/alluxio formatMasters
Tip: If an error occurs, check the logs and fix the cause. For Java environment problems, set JAVA_HOME in conf/alluxio-env.sh.
7) Start
$ ./bin/alluxio-start.sh all SudoMount
# all starts the master node and all worker nodes
# The SudoMount argument mounts the RamFS on the worker nodes; mounting is only needed on the first start
......
All tasks finished
-----------------------------------------
Starting to monitor all remote services.
-----------------------------------------
--- [ OK ] The master service @ clu00 is in a healthy state.
--- [ OK ] The job_master service @ clu00 is in a healthy state.
--- [ OK ] The worker service @ clu00 is in a healthy state.
--- [ OK ] The worker service @ clu02 is in a healthy state.
--- [ OK ] The worker service @ clu01 is in a healthy state.
--- [ OK ] The job_worker service @ clu02 is in a healthy state.
--- [ OK ] The job_worker service @ clu01 is in a healthy state.
--- [ OK ] The job_worker service @ clu00 is in a healthy state.
--- [ OK ] The proxy service @ clu01 is in a healthy state.
--- [ OK ] The proxy service @ clu02 is in a healthy state.
--- [ OK ] The proxy service @ clu00 is in a healthy state.
# Start
$ ./bin/alluxio-start.sh all
# Stop
$ ./bin/alluxio-stop.sh all
$ ./bin/alluxio-start.sh masters # starts all masters in conf/masters
$ ./bin/alluxio-start.sh workers # starts all workers in conf/workers
$ ./bin/alluxio-start.sh master # starts the local master
$ ./bin/alluxio-start.sh worker # starts the local worker
8) Verify the Alluxio cluster
Visit http://<alluxio_master_hostname>:19999 in a browser to view the master node.
Visit http://<alluxio_worker_hostname>:30000 in a browser to view a worker node.
# On the master you should see AlluxioMaster, AlluxioJobMaster, and AlluxioProxy
$ jps
26578 AlluxioProxy
27190 Jps
15670 NameNode
25515 AlluxioMaster
26014 AlluxioJobMaster
# On a worker you should see AlluxioWorker, AlluxioJobWorker, and AlluxioProxy
$ jps
22657 DataNode
25250 AlluxioWorker
25477 AlluxioJobWorker
26151 Jps
25759 AlluxioProxy