---
title: "BiCausality: Binary Causality Inference Framework"
author: "C. Amornbunchornvej"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{BiCausality_demo}
  %\VignetteEngine{knitr::knitr}
  \usepackage[utf8]{inputenc}
---

Example: Inferred binary causal graph from simulation
----------------------------------------------------------------------------------

In the first step, we generate a simulated dataset to use as input.

```{r}
seedN <- 2022
n <- 200   # 200 individuals
d <- 10    # 10 variables
mat <- matrix(nrow = n, ncol = d)   # the input of the framework

# Simulate binary data from a binomial distribution where the probability of a value being 1 is 0.5.
for (i in seq(n)) {
  set.seed(seedN + i)
  mat[i, ] <- rbinom(n = d, size = 1, prob = 0.5)
}

mat[, 1] <- mat[, 2] | mat[, 3]   # 1 is caused by 2 and 3
mat[, 4] <- mat[, 2] | mat[, 5]   # 4 is caused by 2 and 5
mat[, 6] <- mat[, 1] | mat[, 4]   # 6 is caused by 1 and 4
```

We use the following function to infer whether X causes Y.

```{r}
# Run the function
library(BiCausality)
resC <- BiCausality::CausalGraphInferMainFunc(mat = mat, CausalThs = 0.1, nboot = 50, IndpThs = 0.05)
```

The resulting adjacency matrix of the directed causal graph is shown below:

```{r}
resC$CausalGRes$Ehat
```

A nonzero value in element Ehat[i,j] means that i causes j. For example, Ehat[2,1] = 1 implies that node 2 causes node 1, which is correct since node 1 has nodes 2 and 3 as its causes.

The directed causal graph can also be plotted using the code below.

```{r}
library(igraph)
net <- graph_from_adjacency_matrix(resC$CausalGRes$Ehat, weighted = NULL)
plot(net, edge.arrow.size = 0.3, vertex.size = 20, vertex.color = '#D4C8E9', layout = layout_with_kk)
```

For the causal relation between variables 2 and 1, we can use the command below to see further information.

**Note** that the odd difference between X and Y, denoted oddDiff(X,Y), is defined as |P(X = 1, Y = 1) P(X = 0, Y = 0) - P(X = 0, Y = 1) P(X = 1, Y = 0)|.
The quantity inside the absolute value, P(X = 1, Y = 1) P(X = 0, Y = 0) - P(X = 0, Y = 1) P(X = 1, Y = 0), lies between -0.25 and 0.25. It is positive when X and Y tend to take the same value (X is directly proportional to Y) and negative when they tend to take opposite values (X is the inverse of Y). If X and Y have no association, then oddDiff(X,Y) is close to zero.

```{r}
resC$CausalGRes$causalInfo[['2,1']]
```

The output is explained below. Each comment describes the value that follows it, with X = variable 2 and Y = variable 1.

```
#This value represents the 95% confidence interval of P(Y=1|X=1).
$CDirConfValInv
 2.5% 97.5% 
    1     1 

#This value represents the 95% confidence interval of |P(Y=1|X=1) - P(X=1|Y=1)|.
$CDirConfInv
     2.5%     97.5% 
0.3217322 0.4534494 

#This value represents the mean of |P(Y=1|X=1) - P(X=1|Y=1)|.
$CDirmean
[1] 0.3787904

#The test whose null hypothesis is that |P(Y=1|X=1) - P(X=1|Y=1)| is less than
#or equal to the argument of parameter "CausalThs" and whose alternative hypothesis
#is that |P(Y=1|X=1) - P(X=1|Y=1)| is greater than "CausalThs".
$testRes2

	Wilcoxon signed rank test with continuity correction

data:  abs(bCausalDirDist)
V = 1275, p-value = 3.893e-10
alternative hypothesis: true location is greater than 0.1

#The test whose null hypothesis is that |oddDiff(X,Y)| is less than
#or equal to the argument of parameter "IndpThs" and whose alternative hypothesis is
#that |oddDiff(X,Y)| is greater than "IndpThs".
$testRes1

	Wilcoxon signed rank test with continuity correction

data:  abs(bSignDist)
V = 1275, p-value = 3.894e-10
alternative hypothesis: true location is greater than 0.05

#If the test above rejects the null hypothesis at the significance threshold
#alpha (default alpha=0.05), then "sign=1"; otherwise, it is zero.
$sign
[1] 1

#This value represents the 95% confidence interval of oddDiff(X,Y).
$SignConfInv
      2.5%      97.5% 
0.08670325 0.13693900 

#This value represents the mean of oddDiff(X,Y).
$Signmean
[1] 0.1082242
```
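
To make the reported quantities more concrete, the sketch below computes plain point estimates of oddDiff(X,Y) and of |P(Y=1|X=1) - P(X=1|Y=1)| for X = variable 2 and Y = variable 1, using base R on the same simulated data. The helper names `odd_diff` and `cond_gap` are hypothetical and are not functions of the BiCausality package. The package reports bootstrap summaries (`nboot = 50`), so these full-sample values are only expected to be in the same range as the means shown in the output, not identical to them.

```r
# Hypothetical helpers for illustration only; not part of BiCausality.

# Point estimate of oddDiff(X,Y) = |P(1,1) P(0,0) - P(0,1) P(1,0)|
odd_diff <- function(x, y) {
  p11 <- mean(x == 1 & y == 1)
  p00 <- mean(x == 0 & y == 0)
  p01 <- mean(x == 0 & y == 1)
  p10 <- mean(x == 1 & y == 0)
  abs(p11 * p00 - p01 * p10)
}

# Point estimate of |P(Y=1|X=1) - P(X=1|Y=1)|
cond_gap <- function(x, y) {
  p_y_given_x <- mean(y[x == 1])   # P(Y = 1 | X = 1)
  p_x_given_y <- mean(x[y == 1])   # P(X = 1 | Y = 1)
  abs(p_y_given_x - p_x_given_y)
}

# Rebuild the part of the simulated input that involves variables 1, 2 and 3
seedN <- 2022
n <- 200
d <- 10
mat <- matrix(nrow = n, ncol = d)
for (i in seq(n)) {
  set.seed(seedN + i)
  mat[i, ] <- rbinom(n = d, size = 1, prob = 0.5)
}
mat[, 1] <- mat[, 2] | mat[, 3]   # 1 is caused by 2 and 3

x <- mat[, 2]   # candidate cause
y <- mat[, 1]   # candidate effect

odd_diff(x, y)    # compare with $Signmean
cond_gap(x, y)    # compare with $CDirmean
mean(y[x == 1])   # P(Y=1|X=1); equals 1 by construction, matching $CDirConfValInv
```

Because variable 1 is defined as the logical OR of variables 2 and 3, Y is always 1 whenever X is 1, while X is 1 for only part of the cases where Y is 1. This asymmetry between the two conditional probabilities is what the framework uses to orient the edge from 2 to 1.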