Bioinformatics visualization (Part 4): correlation plots
2022-06-22 06:06:00 【GoatGui】
Study notes, for reference only. Corrections are welcome.
Bioinformatics visualization
Correlation scatter plot
A correlation scatter plot shows the correlation between the expression of two genes.
Input data format:
Each row is a gene and each column is a sample.
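As a rough illustration of that layout, the sketch below builds a toy matrix and round-trips it through a tab-delimited file. The gene and sample names, the `id` header cell, and the temporary file are all hypothetical, not taken from the original input file.

```r
# Toy expression matrix: 3 hypothetical genes (rows) x 4 hypothetical samples (columns).
set.seed(1)
expr <- matrix(round(runif(12, 0, 10), 2), nrow = 3,
               dimnames = list(c("GeneA", "GeneB", "GeneC"),
                               paste0("Sample", 1:4)))

# Write it as tab-delimited text; the first header cell ("id") labels the gene column.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(id = rownames(expr), expr), tmp,
            sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back with the gene names as row names.
rt <- read.table(tmp, sep = "\t", header = TRUE, check.names = FALSE, row.names = 1)
print(rt)
```

With this layout, `rt["GeneA", ]` returns the expression of one gene across all samples.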
Code:
library(ggplot2)
library(ggpubr)
library(ggExtra)
inputFile="./22.cor/input.txt"
gene1="MTMR14" # The first gene name
gene2="PRKCD" # The second gene name
# Read the input file and extract the expression values of the two genes
rt=read.table(inputFile,sep="\t",header=T,check.names=F,row.names=1)
x=as.numeric(rt[gene1,])
y=as.numeric(rt[gene2,])
# correlation analysis
df1=as.data.frame(cbind(x,y))
corT=cor.test(x,y,method="spearman")
cor=corT$estimate
pValue=corT$p.value
ggplot(df1, aes(x, y)) +
xlab(gene1)+ylab(gene2)+
geom_point()+ geom_smooth(method="lm",formula = y ~ x) + theme_bw()+
stat_cor(method = 'spearman', aes(x =x, y =y))
In the figure below, the x-axis shows MTMR14 expression and the y-axis shows PRKCD expression. The blue line is the linear regression fit. In the upper left, R is the Spearman correlation coefficient and p is its P value.
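To make R and p more concrete, here is a minimal sketch with made-up expression values (the numbers are hypothetical). It relies on the fact that Spearman's coefficient is the Pearson correlation computed on ranks.

```r
# Hypothetical expression values for two genes across six samples (no ties).
x <- c(2.1, 3.4, 1.8, 5.6, 4.2, 6.0)
y <- c(1.0, 2.9, 1.5, 4.8, 5.1, 5.5)

# Spearman's rho is the Pearson correlation of the ranks.
rho_manual <- cor(rank(x), rank(y))

# cor.test() reports the same coefficient together with a P value.
res <- cor.test(x, y, method = "spearman")

print(rho_manual)
print(unname(res$estimate))
print(res$p.value)
```

Because it works on ranks, the coefficient measures monotonic rather than strictly linear association, so the straight regression line in the plot is only a visual guide.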

Correlation heat map
A correlation heat map shows the pairwise correlations among multiple genes in a single plot.
Input data format:
Each row is a gene and each column is a sample.
Code:
library(corrplot)
inputFile="./23.corrplot/input.txt"
rt=read.table(inputFile,sep="\t",header=T,row.names=1) # Read the file
rt=t(rt) # Data transpose
M=cor(rt) # Correlation matrix
# Draw correlation graph
corrplot(M,
method = "circle",
order = "hclust", # clustering
type = "upper",
col=colorRampPalette(c("green", "white", "red"))(50)
)
In the figure below, the redder a circle is, the closer the correlation coefficient between the two genes is to 1; the greener it is, the closer the coefficient is to -1.
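The transpose step matters because `cor()` correlates the columns of its input. A small sketch with hypothetical genes and values shows the effect:

```r
# Hypothetical matrix: 3 genes (rows) x 5 samples (columns).
expr <- rbind(GeneA = c(1, 2, 3, 4, 5),
              GeneB = c(2, 4, 6, 8, 10),   # rises with GeneA
              GeneC = c(5, 4, 3, 2, 1))    # falls as GeneA rises

# Transpose so that genes become columns; cor() then returns a gene x gene matrix.
M <- cor(t(expr))
print(dim(M))
print(round(M, 2))
```

Without the transpose, the result would be a sample-by-sample matrix instead of a gene-by-gene one.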
Correlation network diagram
Input data format:
Each row is a gene and each column is a sample.
Code:
library(igraph)
library(reshape2)
inputFile="./25.corNetwork/input.txt"
cutoff=0.4 # Correlation threshold
# Read input file
data=read.table(inputFile,header=T,sep="\t",row.names=1,check.names=F)
cordata=cor(t(data))
# Keep half of the correlation matrix
mydata = cordata
upper = upper.tri(mydata)
mydata[upper] = NA
# Transform the correlation matrix into a data frame
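df = data.frame(gene=rownames(mydata),mydata,check.names=FALSE,stringsAsFactors=FALSE) # check.names=FALSE keeps gene names such as "HLA-A" unchanged
dfmeltdata = melt(df,id.vars="gene")
dfmeltdata = dfmeltdata[!is.na(dfmeltdata$value),]
dfmeltdata = dfmeltdata[dfmeltdata$gene!=as.character(dfmeltdata$variable),] # Drop self-correlations (the diagonal)
dfmeltdata = dfmeltdata[abs(dfmeltdata$value)>cutoff,] # Keep pairs whose |r| exceeds the cutoff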
# Define the nodes and edges of the network graph
corweight = dfmeltdata$value
weight = corweight+abs(min(corweight))+5
g = graph_from_data_frame(dfmeltdata,directed = FALSE) # The first two columns (gene, variable) define the edges
# Set edge colors, node size, and label size
E(g)$color = ifelse(corweight>0,rgb(254/255,67/255,101/255,abs(corweight)),rgb(0/255,0/255,255/255,abs(corweight))) # Red for positive, blue for negative; opacity follows |r|
V(g)$size = 8
V(g)$shape = "circle"
V(g)$label.cex = 1.2
V(g)$color = "white"
E(g)$weight = weight
# Visualization: network on top, color legend below
layout(matrix(c(1,1,1,0,2,0),byrow=TRUE,ncol=3),heights=c(6,1),widths=c(3,4,3))
par(mar=c(1.5,2,2,2))
vertex.frame.color = NA
plot(g,layout=layout_nicely,vertex.label.cex=V(g)$label.cex,edge.width=E(g)$weight,edge.arrow.size=0,vertex.label.color="black",vertex.frame.color=vertex.frame.color,edge.color=E(g)$color,vertex.label.font=2,vertex.size=V(g)$size,edge.curved=0.4)
# Draw a legend
color_legend = c(rgb(254/255,67/255,101/255,seq(1,0,by=-0.01)),rgb(0/255,0/255,255/255,seq(0,1,by=0.01)))
par(mar=c(2,2,1,2),xpd = T,cex.axis=1.6,las=1)
barplot(rep(1,length(color_legend)),border = NA, space = 0,ylab="",xlab="",xlim=c(1,length(color_legend)),horiz=FALSE,
axes = F, col=color_legend,main="")
axis(3,at=seq(1,length(color_legend),length=5),c(1,0.5,0,-0.5,-1),tick=FALSE)
In the figure below, each node represents a gene. An edge between two genes indicates a co-expression relationship, meaning the absolute correlation exceeds the cutoff. Positive correlations are drawn in red and negative correlations in blue. The darker the edge, the larger the absolute value of the correlation coefficient between the two genes.
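The edge-selection step can also be sketched in base R, without `melt`. This is an alternative formulation for illustration, not the method used in the code above, and the correlation matrix is hypothetical.

```r
# Hypothetical 3 x 3 gene correlation matrix.
genes <- c("GeneA", "GeneB", "GeneC")
M <- matrix(c( 1.0,  0.8, -0.2,
               0.8,  1.0, -0.6,
              -0.2, -0.6,  1.0),
            nrow = 3, byrow = TRUE, dimnames = list(genes, genes))
cutoff <- 0.4

# Keep each unordered pair once (strict lower triangle), then apply the cutoff.
idx <- which(lower.tri(M) & abs(M) > cutoff, arr.ind = TRUE)
edges <- data.frame(gene1 = rownames(M)[idx[, "row"]],
                    gene2 = colnames(M)[idx[, "col"]],
                    r = M[idx],
                    stringsAsFactors = FALSE)
print(edges)
```

Here the GeneA/GeneC pair is dropped because its absolute correlation (0.2) is below the cutoff, so those two nodes would have no edge between them.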