How to handle a 2 GB CSV file that cannot be opened? Use Byzer
2022-06-26 15:25:00 【MonkeyKing_ sunyuhua】
For a project, we needed to export data from a customer's environment for reconciliation analysis. The customer's data is confidential, so no API could be provided; the data could only be delivered as CSV files.
But a single 2 GB CSV file will freeze or crash most machines that try to open it. Another option is to split the file with a tool, but analyzing the data after splitting is cumbersome.
Byzer is a tool that meets this need.
Official website:
Byzer supports deployment in a private environment, which keeps the data private. Set up the environment as follows:
1. Prepare a Linux machine with about 2 CPU cores and 8 GB of memory
2. Download and install Byzer
wget https://download.byzer.org/byzer/2.3.0.1/byzer-lang-all-in-one-linux-amd64-3.1.1-2.3.0.1.tar.gz
tar -zxvf byzer-lang-all-in-one-linux-amd64-3.1.1-2.3.0.1.tar.gz
cd byzer-lang-all-in-one-linux-amd64-3.1.1-2.3.0.1
3. Start Byzer
./bin/byzer.sh start
4. Byzer is now accessible, but its own interface is not very friendly. You can install the companion visualization tool, Byzer Notebook

5. Download and install Byzer Notebook
wget https://download.byzer.org/byzer-notebook/1.2.0/Byzer-Notebook-1.2.0.tar.gz
tar -xvf Byzer-Notebook-1.2.0.tar.gz
6. The notebook depends on MySQL, so MySQL must be installed first
You can install it with docker-compose; for other installation methods, search online
The docker-compose.yml file:
version: "2"
services:
  mysql:
    container_name: mysql
    image: mysql:5.7
    restart: always
    volumes:
      - ./mysql/data:/var/lib/mysql
      - ./mysql/init:/docker-entrypoint-initdb.d
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: "XXXXX"
      TZ: "Asia/Shanghai"
    command: --max_allowed_packet=32505856
7. Adjust the MySQL settings in the notebook configuration

Configuration directory:
cd /home/Byzer-Notebook-1.2.0/conf

8. Start the notebook
./bin/notebook.sh start

9. After registering an account, you can log in and use it

10. Click Upload and upload your 2 GB CSV file


Because the file is 2 GB, the upload is slow, so wait patiently
11. Create a new notebook and view the file

12. Load the CSV you just uploaded as a table
load csv.`/tmp/upload/billing.csv` where header="true" as r3;
If it is an xlsx file:
load excel.`/tmp/upload/billing.xlsx` where header="true" as r4;
Note:
The path is wrapped in backticks, not single quotes
13. Now you can view and query the data
select * from r3 limit 10 as 2022_06_24_r3;
Note that the statement must end with "as <table alias>". This appears to be a Byzer-specific syntax requirement.
select sum(BlendedCost) from r3 where payerAccountId=417966497442 as 417966497442_count;
Ordinary SQL queries are supported.
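To cross-check a figure such as the sum above outside Byzer, one option is to stream a local copy of the CSV row by row, so the 2 GB file is never loaded into memory at once. The following is only a minimal sketch, not part of Byzer: the function name sum_column is hypothetical, and the column names BlendedCost and payerAccountId are taken from the query above and assumed to match the CSV header exactly.

```python
import csv
from decimal import Decimal, InvalidOperation


def sum_column(path, value_column, filter_column=None, filter_value=None):
    """Stream a CSV file and sum one column, optionally filtering on another.

    Rows are read one at a time, so memory use stays flat regardless of
    file size. Values are compared as strings, exactly as they appear in
    the file.
    """
    total = Decimal("0")
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            if filter_column is not None and row.get(filter_column) != filter_value:
                continue
            raw = (row.get(value_column) or "").strip()
            if not raw:
                continue  # skip empty cells
            try:
                total += Decimal(raw)
            except InvalidOperation:
                continue  # skip cells that are not numeric
    return total
```

For example, sum_column("billing.csv", "BlendedCost", "payerAccountId", "417966497442") would total the same rows as the query above, assuming billing.csv is a local copy of the uploaded file. Decimal is used instead of float so that cost values are added without binary rounding error, which matters for reconciliation.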


Additional notes:
If this error occurs, check the JDK configuration of the environment

If ports 9002 and 9003 cannot be reached, check the security group settings
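Before changing security group rules, it can help to confirm from the client machine whether the ports accept TCP connections at all. This is a small sketch using only the Python standard library; the function name port_is_open is hypothetical, and 127.0.0.1 is a placeholder to replace with the server's address.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9002 and 9003 are the ports named in the note above; the host is a placeholder.
    for port in (9002, 9003):
        state = "open" if port_is_open("127.0.0.1", port) else "unreachable"
        print(f"port {port}: {state}")
```

If a port is reported as open when checked on the server itself but unreachable from another machine, a firewall or security group rule between the two is the likely cause.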