Bioinformatics visualization (Part 4): correlation plots

2022-06-22 06:06:00 GoatGui

Study notes, for reference only. Corrections are welcome.



Bioinformatics visualization

Correlation scatter plot

A correlation scatter plot shows the correlation between the expression levels of two genes.

Input data format:
[Figure: example of the input table]
Rows are genes and columns are samples.
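
Since the original screenshot of the input table is not available, here is a minimal sketch of a file in that layout, assuming a tab-separated table with gene names in the first column. The gene name GENE3, the sample names, and all values are made up for illustration.

```r
# Synthetic example: 3 genes (rows) x 5 samples (columns).
# MTMR14 and PRKCD are the genes used in these notes; GENE3, the sample
# names, and the expression values are hypothetical.
set.seed(1)
expr <- matrix(round(rexp(15, rate = 0.2), 2), nrow = 3,
               dimnames = list(c("MTMR14", "PRKCD", "GENE3"),
                               paste0("sample", 1:5)))

# Write a tab-separated file with the gene names in the first column.
tmpFile <- tempfile(fileext = ".txt")
write.table(data.frame(id = rownames(expr), expr, check.names = FALSE),
            file = tmpFile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back with the gene names as row names.
rt <- read.table(tmpFile, sep = "\t", header = TRUE,
                 check.names = FALSE, row.names = 1)

# Extract one gene's expression across all samples as a numeric vector.
x <- as.numeric(rt["MTMR14", ])
print(dim(rt))   # 3 genes, 5 samples
```

Indexing by row name (`rt["MTMR14", ]`) only works if the gene names are unique, because `row.names=1` requires unique values in the first column.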

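Code:

library(ggplot2)
library(ggpubr)    # Provides stat_cor()
library(ggExtra)   # Loaded in the original notes; not used in the code below

inputFile="./22.cor/input.txt"      # Tab-separated expression matrix
gene1="MTMR14"             # Name of the first gene
gene2="PRKCD"              # Name of the second gene

# Read the input file and extract the expression values of the two genes
rt=read.table(inputFile,sep="\t",header=TRUE,check.names=FALSE,row.names=1)
x=as.numeric(rt[gene1,])
y=as.numeric(rt[gene2,])

# Correlation analysis (Spearman rank correlation)
df1=data.frame(x=x,y=y)
corT=cor.test(x,y,method="spearman")
corValue=corT$estimate     # Correlation coefficient
pValue=corT$p.value        # P value of the correlation test

# Scatter plot with a linear regression line and the correlation annotated
ggplot(df1, aes(x, y)) +
  xlab(gene1) + ylab(gene2) +
  geom_point() + geom_smooth(method="lm", formula = y ~ x) + theme_bw() +
  stat_cor(method = "spearman", aes(x = x, y = y))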

In the figure below, the x axis shows MTMR14 expression and the y axis shows PRKCD expression. The blue line is the fitted linear regression line. In the top left, R is the correlation coefficient and p is its P value.

[Figure: correlation scatter plot of MTMR14 and PRKCD expression]
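
Note that the annotated R is a Spearman coefficient, which is computed on ranks, while the blue line is an ordinary least-squares fit on the raw values, so the two describe slightly different things. The sketch below uses synthetic data (not real expression values) to show that the Spearman coefficient reported by `cor.test()` equals the Pearson correlation of the ranks.

```r
# Synthetic values for two hypothetical genes (not real expression data).
set.seed(42)
x <- rexp(30)
y <- x^2 + rnorm(30, sd = 0.5)   # monotone but non-linear relationship

# Spearman's rho as reported by cor.test()
corT <- cor.test(x, y, method = "spearman")
rho <- unname(corT$estimate)

# The same coefficient obtained as the Pearson correlation of the ranks
rhoFromRanks <- cor(rank(x), rank(y))

print(all.equal(rho, rhoFromRanks))
print(corT$p.value)
```

Because it depends only on ranks, the Spearman coefficient is less sensitive to outliers and to skewed expression values than the Pearson coefficient.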

Correlation heat map

A correlation heat map shows the pairwise correlations among multiple genes in a single plot.

Input data format:
[Figure: example of the input table]

Rows are genes and columns are samples.

Code:

library(corrplot)
inputFile="./23.corrplot/input.txt"       # Tab-separated expression matrix

rt=read.table(inputFile,sep="\t",header=TRUE,row.names=1)      # Read the file
rt=t(rt)      # Transpose so that genes are in columns
M=cor(rt)     # Gene-by-gene correlation matrix (Pearson by default)

# Draw the correlation heat map
corrplot(M,
         method = "circle",
         order = "hclust", # Order genes by hierarchical clustering
         type = "upper",   # Show only the upper triangle
         col=colorRampPalette(c("green", "white", "red"))(50)
         )

In the figure below, the redder a circle, the closer the correlation coefficient between the two genes is to 1; the greener a circle, the closer it is to -1.
[Figure: correlation heat map]
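
The transpose step matters because `cor()` correlates the columns of its input. A minimal sketch with a synthetic matrix (hypothetical gene and sample names) shows the shape and basic properties of the result. The clustering at the end is only one possible way to order genes by similarity and is not claimed to match what corrplot does internally.

```r
# Synthetic gene x sample matrix: 4 hypothetical genes, 10 samples.
set.seed(7)
expr <- matrix(rnorm(40), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("sample", 1:10)))

# cor() correlates columns, so genes must be moved to the columns first.
M <- cor(t(expr))

print(dim(M))            # one row and one column per gene
print(isSymmetric(M))    # correlation matrices are symmetric
print(diag(M))           # each gene correlates perfectly with itself

# Illustrative ordering: treat 1 - correlation as a distance and
# cluster the genes hierarchically.
ord <- hclust(as.dist(1 - M))$order
print(rownames(M)[ord])
```

Without the transpose, `cor(expr)` would return a sample-by-sample matrix instead.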

Correlation network diagram

Input data format:
[Figure: example of the input table]

Rows are genes and columns are samples.

Code:

library(igraph)
library(reshape2)

inputFile="./25.corNetwork/input.txt"
cutoff=0.4  # Correlation threshold (applied to the absolute value)

# Read the input file (rows are genes, columns are samples)
data=read.table(inputFile,header=TRUE,sep="\t",row.names=1,check.names=FALSE)
cordata=cor(t(data))

# Keep only the lower half of the correlation matrix so each pair appears once
mydata = cordata
upper = upper.tri(mydata)
mydata[upper] = NA

# Convert the correlation matrix into a long data frame (one row per gene pair)
# check.names=FALSE keeps gene names such as "HLA-A" unchanged in the column names
df = data.frame(gene=rownames(mydata),mydata,check.names=FALSE)
dfmeltdata = melt(df,id.vars="gene")
dfmeltdata = dfmeltdata[!is.na(dfmeltdata$value),]
dfmeltdata = dfmeltdata[as.character(dfmeltdata$gene)!=as.character(dfmeltdata$variable),]
dfmeltdata = dfmeltdata[abs(dfmeltdata$value)>cutoff,]

# Define the edges and nodes of the network graph
corweight = dfmeltdata$value
weight = corweight+abs(min(corweight))+5   # Shift so that all edge weights are positive
g = graph_from_data_frame(dfmeltdata,directed = FALSE)

# Set edge color, node size, node shape, and label size
E(g)$color = ifelse(corweight>0,rgb(254/255,67/255,101/255,abs(corweight)),rgb(0/255,0/255,255/255,abs(corweight)))
V(g)$size = 8
V(g)$shape = "circle"
V(g)$label.cex = 1.2
V(g)$color = "white"
E(g)$weight = weight

# Visualization: network in the top panel, legend in the bottom panel
layout(matrix(c(1,1,1,0,2,0),byrow=TRUE,ncol=3),heights=c(6,1),widths=c(3,4,3))
par(mar=c(1.5,2,2,2))
vertex.frame.color = NA
plot(g,layout=layout_nicely,
     vertex.size=V(g)$size,vertex.frame.color=vertex.frame.color,
     vertex.label.cex=V(g)$label.cex,vertex.label.color="black",vertex.label.font=2,
     edge.width=E(g)$weight,edge.color=E(g)$color,edge.arrow.size=0,edge.curved=0.4)

# Draw the color legend (red: positive, blue: negative, opacity: absolute correlation)
color_legend = c(rgb(254/255,67/255,101/255,seq(1,0,by=-0.01)),rgb(0/255,0/255,255/255,seq(0,1,by=0.01)))
par(mar=c(2,2,1,2),xpd = TRUE,cex.axis=1.6,las=1)
barplot(rep(1,length(color_legend)),border = NA, space = 0,ylab="",xlab="",xlim=c(1,length(color_legend)),horiz=FALSE,
        axes = FALSE, col=color_legend,main="")
axis(3,at=seq(1,length(color_legend),length=5),c(1,0.5,0,-0.5,-1),tick=FALSE)

Each node in the figure below represents a gene. An edge between two genes means that they are co-expressed, that is, the absolute value of their correlation coefficient exceeds the cutoff. Positive correlations are drawn in red and negative correlations in blue. The darker the edge, the larger the absolute value of the correlation coefficient between the two genes.

[Figure: correlation network diagram]
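
The step that turns the correlation matrix into a list of edges can also be written with base R only. The following is a sketch of that alternative on synthetic data, not the method used in the notes above; the gene names, the values, and the column names `gene1`, `gene2`, and `value` are hypothetical.

```r
# Synthetic gene x sample matrix: 5 hypothetical genes, 20 samples.
set.seed(3)
expr <- matrix(rnorm(100), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))
expr["gene2", ] <- expr["gene1", ] + rnorm(20, sd = 0.3)    # strongly positive pair
expr["gene3", ] <- -expr["gene1", ] + rnorm(20, sd = 0.3)   # strongly negative pair

cutoff <- 0.4
cordata <- cor(t(expr))

# Keep each gene pair once (lower triangle, diagonal excluded)
# and keep only pairs whose absolute correlation exceeds the cutoff.
keep <- lower.tri(cordata) & abs(cordata) > cutoff
idx <- which(keep, arr.ind = TRUE)

# One row per edge: the two genes and their correlation coefficient.
edges <- data.frame(gene1 = rownames(cordata)[idx[, "row"]],
                    gene2 = colnames(cordata)[idx[, "col"]],
                    value = cordata[idx],
                    stringsAsFactors = FALSE)
print(edges)
```

A data frame in this shape (two node columns followed by edge attributes) can be passed to `graph_from_data_frame(edges, directed = FALSE)` from igraph.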

Copyright notice
This article was written by [GoatGui]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/173/202206220544496591.html