 PROCEEDINGS
 Open Access
A network-based covariance test for detecting multivariate eQTL in Saccharomyces cerevisiae
 Huili Yuan^{1},
 Zhenye Li^{1},
 Nelson L.S. Tang^{2} and
 Minghua Deng^{1, 3, 4}
https://doi.org/10.1186/s12918-015-0245-0
© Yuan et al. 2016
 Published: 11 January 2016
Abstract
Background
Expression quantitative trait locus (eQTL) analysis is widely used to understand how genetic variation affects gene expression in biological systems. Traditional eQTL analysis is pairwise: one SNP is tested for its effect on the expression of one gene. In this way, some associated markers found in GWAS have been linked to disease mechanisms. However, a biological process is usually carried out by a group of genes. Although methods have been proposed to identify groups of SNPs that affect the mean expression of genes in a network, changes in the coexpression pattern have not been considered. We therefore propose a procedure and algorithm to identify markers that affect the coexpression pattern of a pathway. Because two genes may have different correlations under different isoforms, which is hard to detect with a linear test, we also consider a nonlinear test.
Results
When we applied our method to a yeast eQTL dataset profiled under both glucose and ethanol conditions, we identified a total of 166 modules, each consisting of a group of genes and one eQTL that regulates the coexpression pattern of that group of genes. Many of these modules have biological significance.
Conclusions
We propose a network-based covariance test to identify SNPs that affect the structure of a pathway. Because two genes may have different correlations under different isoforms, which is hard to detect with a linear test, we also consider a nonlinear test.
Keywords
 eQTL
 Pathway
 Isoform
Background
Genome-wide association studies (GWAS) aim to detect associations between genetic variation and complex diseases. In recent years, GWAS have found 2000 loci associated with complex diseases by statistical methods [1]. With the development of next-generation sequencing and other high-throughput technologies, various types of genome-scale datasets have been collected, providing an opportunity to uncover the mechanisms by which genetic variation leads to complex diseases by connecting high-throughput data to GWAS. eQTL studies are one such approach; they aim to uncover genetic effects on gene expression and have been conducted in many organisms [2]–[5]. A common approach in eQTL data analysis is to test the association between each expression trait and each genetic marker through regression analysis. Despite the great success of this approach, some regulatory signals may not be detected because of complex interactions between SNPs, such as epistasis.
Although most eQTL studies treat the expression level of an individual gene as the response (a single outcome variable), changes in the correlation between genes under different genetic backgrounds also carry biological information. For example, post-translational regulation such as phosphorylation and dephosphorylation often affects the activity of transcription factors (TFs), which in turn affects the correlation between TF genes and their target genes, as well as the coexpression patterns among the targets. Such regulation is hard to detect when only individual genes are considered, because the expression levels of the TF genes themselves may change very little. The “liquid association” (LA) approach for a pair of genes proposed in [6] is one method for identifying such loci, and it was later introduced into eQTL studies [7]. Subsequently, conditional bivariate normal models were developed to capture the change in correlation between a pair of genes [8]–[10].
However, a biological process is usually carried out by a group of genes (more than the two genes of the bivariate model), and network approaches should be used to study these interactions [11]–[13]. To see the effect of a cellular change on the organism, it is better to consider changes in a functional gene set such as a pathway. Some papers have therefore considered the multivariate setting by applying canonical correlation analysis (CCA) to gene expression and SNP (or CNV) data [14]–[16]. However, these methods do not consider the network structure when testing the association between gene sets and genetic variants, and so miss the information contained in the network. Li et al. [17], Kim and Xing [18], Zhang and Kim [19], and Casale et al. [20] considered pathway structure when studying the association between genetic variation and gene expression, but they assume that the network structure is the same (static) under different genetic variants. In fact, network structure may be dynamic, and it has been argued that differential network analysis will become a standard mode of network analysis and can yield insightful discoveries [21]. For example, [22] identified a cancer point mutation in the kinase domain of RET that causes multiple endocrine neoplasia type 2B by switching peptide specificity and thereby altering the network structure.
We therefore propose a method to test whether the coexpression pattern in a pathway is affected by a SNP. Our goal is to test for a global change in covariance structure in each pathway, which differs from other network-based methods that try to detect nonzero edges among all pairs of genes. When we applied our method to a yeast eQTL dataset, we found pathway-SNP modules that have biological significance.
Methods
Model
We use a covariance test to find pathway-SNP modules. The covariance test for a given gene set S has three key elements, following a strategy similar to [26].

• Calculation of the T statistic. We calculate a T statistic that reflects the difference between the covariance matrices of the two classes of samples. The statistic is calculated by estimating the Frobenius norm of the difference of the covariance matrices. We first apply the method of [27] to test
$H_0: \Sigma_1 = \Sigma_2, \quad H_1: \Sigma_1 \ne \Sigma_2$ (1)
where $\Sigma_1$ is the covariance matrix of gene expression under one genotype and $\Sigma_2$ is that of gene expression under the other genotype. We then consider nonlinear relationships between gene expressions by applying a kernel method.

• Estimation of the significance level of the T statistic. We estimate the statistical significance (nominal P value) of the T statistic with an empirical SNP-based permutation test that preserves the complex correlation structure of the gene expression data. Specifically, we permute the SNP labels and recompute the T statistic of the gene set for the permuted data, which generates a null distribution for the T statistic. The empirical, nominal P value of the observed T statistic is then calculated relative to this null distribution. Importantly, permuting class labels preserves gene-gene correlations and thus provides a more biologically reasonable assessment of significance than would be obtained by permuting genes.

• Adjustment for multiple hypothesis testing. When an entire database of gene sets is evaluated, we adjust the estimated significance level to account for multiple hypothesis testing. We first normalize the T statistic for each gene set to account for the size of the set, yielding a normalized T (NT) statistic. We then control the proportion of false positives by calculating the false discovery rate (FDR) corresponding to each NT statistic. The FDR is the estimated probability that a set with a given NT statistic represents a false positive finding; it is computed by comparing the tails of the observed and null distributions of the NT statistic. To capture changes in the structure of the gene network, we consider the covariance of the gene expression.
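The first two elements above can be illustrated with a minimal Python sketch. It is not the authors' R implementation: in place of the estimator of [27] it uses a simple plug-in statistic (the squared Frobenius norm of the difference between the two sample covariance matrices), and the function names, the 0/1 genotype coding, and the `(1 + count) / (1 + n_perm)` form of the empirical P value are assumptions made for illustration.

```python
import numpy as np


def frobenius_cov_stat(expr, genotype):
    """Plug-in statistic: squared Frobenius norm of S1 - S2.

    expr     : (samples x genes) expression matrix for one gene set
    genotype : 0/1 vector giving the marker allele of each sample
               (hypothetical coding; the paper does not specify one)
    """
    s1 = np.cov(expr[genotype == 0], rowvar=False)
    s2 = np.cov(expr[genotype == 1], rowvar=False)
    return float(np.sum((s1 - s2) ** 2))


def permutation_pvalue(expr, genotype, n_perm=1000, seed=0):
    """Empirical P value obtained by permuting the SNP labels.

    Rows of `expr` are never shuffled, so gene-gene correlations
    are preserved under the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = frobenius_cov_stat(expr, genotype)
    null = np.empty(n_perm)
    for b in range(n_perm):
        null[b] = frobenius_cov_stat(expr, rng.permutation(genotype))
    # Add-one correction keeps the estimate strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)
```

Calling `permutation_pvalue(expr, genotype)` for one pathway and one marker returns the observed statistic and its nominal P value; the normalization and FDR steps of the third element would be applied across all pathway-marker pairs afterwards.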
Test for high-dimensional covariance matrices
where h refers to a subpopulation with a particular SNP.
where $T_{n_2,n_3}$ is defined analogously to $T_{n_1,n_2}$.
Kernel method
We generalize the method of [27] to kernel space, inspired by the method of [28], and define the Frobenius norm and covariance matrix analogously. Let $p_x$ and $p_y$ be Borel probability measures defined on a domain $\Omega$, and let $X := \{x_1, \ldots, x_m\}$ and $Y := \{y_1, \ldots, y_n\}$ be observations drawn independently and identically distributed (i.i.d.) from $p_x$ and $p_y$, respectively.
For the test
$H_0: \Sigma_{xx} = \Sigma_{yy} = \Sigma_{zz}, \quad H_1: \Sigma_{xx} \ne \Sigma_{yy} \text{ or } \Sigma_{yy} \ne \Sigma_{zz}$
we consider $\|\Sigma_{xx} - \Sigma_{yy}\|_{HS}^{2} + \|\Sigma_{yy} - \Sigma_{zz}\|_{HS}^{2}$.
where $T_{n_2,n_3}$ is defined analogously to $T_{n_1,n_2}$.
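As an illustration of the quantity $\|\Sigma_{xx} - \Sigma_{yy}\|_{HS}^{2}$, the following Python sketch computes a plug-in estimate from centered Gram matrices. It is a simplified stand-in, not the estimator used in the paper: the Gaussian kernel, its bandwidth `sigma`, and all function names are assumptions. With a linear kernel the value reduces to the squared Frobenius norm of the difference between the two (1/n-normalized) sample covariance matrices, which gives a quick sanity check.

```python
import numpy as np


def linear_gram(a, b):
    """Gram matrix of the linear kernel k(u, v) = u'v."""
    return a @ b.T


def rbf_gram(a, b, sigma):
    """Gram matrix of the Gaussian kernel with bandwidth sigma (assumed choice)."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def centering(n):
    """Centering matrix H = I - 11'/n."""
    return np.eye(n) - np.ones((n, n)) / n


def hs_cov_diff(x, y, gram):
    """Plug-in estimate of the squared Hilbert-Schmidt norm of the
    difference between the empirical covariance operators of x and y.

    x, y : (samples x genes) arrays for the two genotype classes
    gram : callable (a, b) -> kernel Gram matrix
    """
    m, n = len(x), len(y)
    hm, hn = centering(m), centering(n)
    kxx = hm @ gram(x, x) @ hm   # features centered by the mean of x
    kyy = hn @ gram(y, y) @ hn   # features centered by the mean of y
    kxy = hm @ gram(x, y) @ hn   # rows centered by x, columns by y
    return float(np.sum(kxx ** 2) / m ** 2
                 + np.sum(kyy ** 2) / n ** 2
                 - 2.0 * np.sum(kxy ** 2) / (m * n))
```

For three genotype classes, the statistic in the text would be the sum `hs_cov_diff(x, y, gram) + hs_cov_diff(y, z, gram)`; a Gaussian kernel can be passed as `lambda a, b: rbf_gram(a, b, sigma)`.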
Results
Simulation
Comparison between linear method and kernel method
We performed a simulation study to evaluate the power of the proposed kernel method and compared the results with the original linear method of [27]. Three models were considered, as follows.
Model 1: $X_{ijk} = Z_{ijk} + \theta Z_{ij,k+1}$, where the $Z_{ijk}$ are i.i.d. standard normal, and $\theta = 0.5$ under the null hypothesis and $\theta = 0.2$ or $0.3$ under the alternative hypothesis.
Model 2: $X_{ijk} = Z_{ijk}^{3} + \theta Z_{ij,k+1}^{3}$, where $Z_{ijk}$ and $\theta$ are defined as in Model 1.
Model 3: $X_{ijk} = e^{Z_{ijk}} + \theta e^{Z_{ij,k+1}}$, where $Z_{ijk}$ and $\theta$ are defined as in Model 1.
The correlation between variables is linear in Model 1 and nonlinear in Models 2 and 3.
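The three models can be generated with a few lines of Python. This is a sketch under stated assumptions: the last index k is taken to be the variable (gene) index, the first two indices are taken to identify the group and the sample, and the function name and arguments are hypothetical.

```python
import numpy as np


def simulate_model(model, theta, n, p, seed=0):
    """Draw n samples of p variables from Model 1, 2 or 3.

    Each variable k combines a transformed latent Z_k with theta times
    the transformed neighbour Z_{k+1}, so p + 1 latent columns are drawn.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, p + 1))
    if model == 1:
        g = z              # linear dependence
    elif model == 2:
        g = z ** 3         # cubic transform
    elif model == 3:
        g = np.exp(z)      # exponential transform
    else:
        raise ValueError("model must be 1, 2 or 3")
    return g[:, :-1] + theta * g[:, 1:]
```

A power study would draw one group with `theta=0.5` and another with `theta=0.2` or `0.3`, apply both the linear and the kernel test, and record how often each rejects the null hypothesis.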
Comparison between Li and Chen’s linear method and another method
We conducted a simulation to compare the power of Li and Chen’s method [27] and Cai et al.’s method [29]. We consider four simulation setups representing different numbers and strengths of signals, the first of which is the same as Model 2 in [29].
Model 1: Let $U = (u_{kl})$ be a matrix with eight random nonzero entries, each with a magnitude generated from $\mathrm{Unif}(0,4) \cdot \max_{1 \le j \le p} \sigma_{jj}$. Each class has 50 samples, and the number of variables is 50.
Model 2: Let $U = (u_{kl})$ be a matrix with eight random nonzero entries, each with a magnitude generated from $\mathrm{Unif}(0,400) \cdot \max_{1 \le j \le p} \sigma_{jj}$.
Model 3: Let $U = (u_{kl})$ be a matrix with 500 random nonzero entries, each with a magnitude generated from $\mathrm{Unif}(0,4) \cdot \max_{1 \le j \le p} \sigma_{jj}$.
Model 4: Let $U = (u_{kl})$ be a matrix with 500 random nonzero entries, each with a magnitude generated from $\mathrm{Unif}(0,400) \cdot \max_{1 \le j \le p} \sigma_{jj}$.
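A perturbation matrix of this kind could be generated roughly as follows. The text does not say how $U$ enters the two covariance matrices, so this Python sketch makes several assumptions that should be checked against [29]: $U$ is symmetric with its nonzero entries placed off the diagonal, all entries are positive, the second class has covariance $\Sigma + U$, and a common multiple of the identity is added to both matrices to keep them positive definite. All function names are hypothetical.

```python
import numpy as np


def sparse_perturbation(p, n_nonzero, upper, sigma_max, seed=0):
    """Symmetric p x p matrix with n_nonzero off-diagonal entries.

    Magnitudes are drawn from Unif(0, upper) * sigma_max, where
    sigma_max stands for max_j sigma_jj of the base covariance.
    """
    if n_nonzero % 2:
        raise ValueError("n_nonzero must be even for a symmetric matrix")
    rng = np.random.default_rng(seed)
    rows, cols = np.triu_indices(p, k=1)
    pick = rng.choice(len(rows), size=n_nonzero // 2, replace=False)
    mags = rng.uniform(0.0, upper, size=len(pick)) * sigma_max
    u = np.zeros((p, p))
    u[rows[pick], cols[pick]] = mags
    u[cols[pick], rows[pick]] = mags   # mirror to keep U symmetric
    return u


def shifted_covariances(sigma, u, margin=0.05):
    """Return (Sigma1, Sigma2) = (sigma + d*I, sigma + u + d*I).

    The shift d is an assumption of this sketch: it is chosen so that
    both matrices have smallest eigenvalue of at least `margin`.
    """
    lam = min(np.linalg.eigvalsh(sigma).min(),
              np.linalg.eigvalsh(sigma + u).min())
    d = abs(lam) + margin
    eye = np.eye(len(sigma))
    return sigma + d * eye, sigma + u + d * eye
```

Under these assumptions, Model 1 corresponds to `sparse_perturbation(50, 8, 4.0, sigma_max)` and Model 4 to `sparse_perturbation(50, 500, 400.0, sigma_max)`, so the four setups vary the number of changed entries and their size independently.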
Real data results
Associated SNP and pathways
We analyzed the yeast dataset collected by Kruglyak and colleagues [30]. The expression data were downloaded from http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.0060083, with 4482 genes measured in 109 segregants derived from a cross between BY and RM. The experiments were performed under two conditions, glucose and ethanol. We preprocessed the data as in [10], after which 4419 genes and 820 merged markers remained. We mapped the 4419 genes to 103 pathways and analyzed the effect of each SNP on each pathway, giving 103 × 820 = 84,460 tests. The algorithm was implemented in R and can be found at http://www.math.pku.edu.cn/teachers/dengmh/NetworkBiomarker.
New associated pathways and SNPs under ethanol condition
| Pathways | Associated markers |
| --- | --- |
| Glycolysis/Gluconeogenesis | gOL02^{(10)} |
| Synthesis and degradation of ketone bodies | YLR257W^{(10)}, YLR261C^{(10)} |
| Steroid biosynthesis | YEL021W^{L}, YFR035C^{L}, YJL001W^{L}, YJR006W^{L,(10)}, YJL007C^{L,(10)} |
| Valine, leucine and isoleucine degradation | YOR006C^{L}, NOR005W^{L,(10)}, YOR051C^{L,(10)}, YOR076C^{L,(10)} |
| Valine, leucine and isoleucine biosynthesis | NLR116W^{L,(10)}, YOR076C^{L}, YCL023C^{(10)}, YLR257W^{(10)} |
| Histidine metabolism | gOL02^{L,(10)}, YOR025W^{(10)} |
| Tyrosine metabolism | YFL019C^{L}, gOL02^{L,(10)} |
| Phenylalanine metabolism | gOL02^{L,(10)} |
| beta-Alanine metabolism | gOL02^{L,(10)} |
| Taurine and hypotaurine metabolism | gOL02^{(10)} |
| Selenocompound metabolism | YOR006C^{L,(10)}, YOR019W^{L,(10)}, YOR025W^{L,(10)}, NOR005W^{L,(10)} |
| Purine metabolism | NNL035W^{(1)} |
| Cyanoamino acid metabolism | YLR027C^{(1)} |
| Arachidonic acid metabolism | gPL09^{(1)} |
| Linoleic acid metabolism | YFL029C^{L,(1),(10)}, YFL019C^{(1)} |
| Glyoxylate and dicarboxylate metabolism | NNL035W^{(1)}, YNL074C^{(1)} |
| Porphyrin and chlorophyll metabolism | NBR008W^{(1)} |
| Sphingolipid metabolism | YHL047C^{L} |
| Pantothenate and CoA biosynthesis | YGL053W^{L}, NLR116W^{L,(10)} |
| Terpenoid backbone biosynthesis | YJL007C^{L,(10)}, YJL001W^{L,(10)}, YJR006W^{L,(10)}, NJR006C^{L,(10)} |
| Sesquiterpenoid and triterpenoid biosynthesis | YOR334W^{L}, YOR343C^{L}, YLR261C^{(10)}, NLR116W^{(10)}, YLR257W^{(10)} |
| Metabolic pathways | YIL078W^{(10)}, YLR257W^{(10)}, YLR308W^{(10)}, NNL035W^{(10)}, gOL02^{(10)}, YOR006C^{(10)}, YOR051C^{(10)}, YOR019W^{(10)}, YNL066W^{(10)}, YLR261C^{(10)} |
| Biosynthesis of secondary metabolites | YOR025W^{(10)}, YOR063W^{(10)} |
| Carbon metabolism | YOR019W^{(10)} |
| 2-Oxocarboxylic acid metabolism | YLR261C^{L,(10)}, YLR308W^{L,(10)}, YCL022C^{(10)}, YLR265C^{(10)}, NLR116W^{(10)}, YLR322W^{L} |
| mRNA surveillance pathway | YOR072W^{L} |
| Mismatch repair | gKR08^{L} |
| Nonhomologous end | YGR006W^{L} |
| Biosynthesis of amino acids | gOL02^{(10)} |
| MAPK signaling pathway | YDR164C^{(10)}, gDR10^{(10)} |
New associated pathways and SNPs under glucose condition
| Pathways | Associated markers |
| --- | --- |
| Synthesis and degradation of ketone bodies | gJL07^{(10)} |
| Inositol phosphate metabolism | YBR259W^{(10)} |
| Riboflavin metabolism | YML056C^{(10)} |
| Fatty acid degradation | YBR045C^{(1)} |
| Cysteine and methionine metabolism | YGL195W^{(1)} |
| Valine, leucine and isoleucine biosynthesis | YCL025C^{L}, NGR093C^{L}, YOR253W^{L}, YOR274W^{L}, YOR326W^{L}, YOR334W^{L}, YOR343C^{L}, YCL022C^{(1)} |
| Phenylalanine metabolism | YJR040W^{L}, YOL123W^{L}, YOL118C^{L}, YOL106W^{L}, YOL093W^{L}, YOL088C^{L}, gOL02^{L,(1)} |
| beta-Alanine metabolism | YBR271W^{L}, gOL02^{L,(1)}, NJR007C^{L}, YOL106W^{L} |
| Arachidonic acid metabolism | YIR022W^{L,(1)} |
| Vitamin B6 metabolism | YKL118W^{(1)} |
| Porphyrin and chlorophyll metabolism | YML071C^{(1)}, gFL02^{L} |
| Degradation of aromatic compounds | YMR316C^{L,(1)}, YMR316C^{L} |
| ABC transporters | YBR131W^{L,(1)}, YBR137W^{L} |
| Glycolysis/Gluconeogenesis | YJR071W^{L} |
| Pentose phosphate pathway | NOL043W^{L}, YOL151W^{L}, YOL123W^{L}, YOL094C^{L}, YOL093W^{L}, YOL088C^{L}, gOL02^{L} |
| Pentose and glucuronate interconversions | YGL263W^{L} |
| Purine metabolism | YLR140W^{L} |
| Pyrimidine metabolism | YBL010C^{L}, YGL217C^{L} |
| Glycine, serine and threonine metabolism | YCL065W^{L}, YJR038C^{L} |
| Lysine biosynthesis | YBR087W^{L} |
| Histidine metabolism | YBR271W^{L}, NJR007C^{L}, YJR040W^{L}, YJR057W^{L}, YOL106W^{L}, YOL093W^{L}, gOL02^{L,(1)} |
| Tyrosine metabolism | YOL123W^{L}, YOL106W^{L}, YOL094C^{L}, gOL02^{L,(1)} |
| Cyanoamino acid metabolism | YDR351W^{L} |
| Starch and sucrose metabolism | YER095W^{L}, YER116C^{L} |
| Linoleic acid metabolism | NDR174C^{L} |
| Butanoate metabolism | YBR271W^{L} |
| Pantothenate and CoA biosynthesis | YOR274W^{L} |
| Lipoic acid metabolism | gLL01^{L}, YNL158W^{L} |
| Folate biosynthesis | NML013W^{L}, YNL066W^{L}, YNL050C^{L} |
| Sesquiterpenoid and triterpenoid biosynthesis | YMR084W^{L} |
| Aminoacyl-tRNA biosynthesis | YCL065W^{L}, YCL047C^{L}, YCL039W^{L}, NJR007C^{L}, YNL010W^{L} |
| Biosynthesis of unsaturated fatty acids | YFL029C^{L} |
| Metabolic pathways | YCL065W^{L}, YJR071W^{L} |
| Biosynthesis of secondary metabolites | YJR038C^{L} |
| Biosynthesis of amino acids | YJR071W^{L}, YOL123W^{L}, YOL118C^{L}, YOL106W^{L}, YOL094C^{L}, YOL093W^{L}, YOL088C^{L}, gOL02^{L} |
| Ribosome | YAR035W^{L}, YJL026W^{L} |
| RNA transport | YBL010C^{L} |
| RNA polymerase | YLR140W^{L} |
| Proteasome | YBL010C^{L} |
| Phosphatidylinositol signaling system | YBR045C^{L} |
| Meiosis - yeast | YOL106W^{L} |
Kernel method found isoform-specific structure change
Linoleic acid metabolism is associated with cell cycle
Discussion and conclusion
We propose a network-based covariance test to identify markers that affect the structure of a pathway. An advantage of the test is that it does not assume a static network structure. The biomarker we define is a SNP associated with the coexpression structure of the genes in a pathway. Because two genes may have different correlations under different isoforms, which is hard to detect with a linear test, we also consider a nonlinear test. We identified a total of 166 modules, each consisting of a group of genes and one eQTL that regulates the coexpression pattern of that group of genes, and many of these modules have biological interpretations. So far, we have measured the difference between two networks through covariance matrices and covariance operators. In future research we will focus on differences between precision matrices.
Declarations
This article has been published as part of BMC Systems Biology Volume 10 Supplement 1, 2016: Selected articles from the Fourteenth Asia Pacific Bioinformatics Conference (APBC 2016): Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/10/S1.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 31171262, 31428012, 31471246), and the National Key Basic Research Project of China (No. 2015CB910303). We thank Dr. Lin Wang for helpful feedback during the Saccharomyces cerevisiae data preprocessing, and Dr. Minping Qian and Dr. Kai Song for helpful discussions. The publication costs for this article were funded by the National Natural Science Foundation of China (No. 31471246).
References
 1. Visscher PM, Brown MA, McCarthy MI, Yang J: Five years of GWAS discovery. Am J Hum Genet. 2012, 90 (1): 7-24. 10.1016/j.ajhg.2011.11.029.
 2. Albert FW, Kruglyak L: The role of regulatory variation in complex traits and disease. Nat Rev Genet. 2015, 16: 197-212.
 3. Cookson W, Liang L, Abecasis G, Moffatt M, Lathrop M: Mapping complex disease traits with global gene expression. Nat Rev Genet. 2009, 10 (3): 184-94. 10.1038/nrg2537.
 4. Nica AC, Dermitzakis ET: Expression quantitative trait loci: present and future. Phil Trans R Soc B: Biol Sci. 2013, 368 (1620): 20120362. 10.1098/rstb.2012.0362.
 5. Wang P, Dawson JA, Keller MP, Yandell BS, Thornberry NA, Zhang BB, et al: A model selection approach for expression quantitative trait loci (eQTL) mapping. Genetics. 2011, 187 (2): 611-21. 10.1534/genetics.110.122796.
 6. Li KC: Genome-wide coexpression dynamics: theory and application. Proc Natl Acad Sci. 2002, 99 (26): 16875-80. 10.1073/pnas.252466999.
 7. Sun W, Yuan S, Li KC: Trait-trait dynamic interaction: 2D-trait eQTL mapping for genetic variation study. BMC Genomics. 2008, 9 (1): 242. 10.1186/1471-2164-9-242.
 8. Ho YY, Parmigiani G, Louis TA, Cope LM: Modeling liquid association. Biometrics. 2011, 67 (1): 133-41. 10.1111/j.1541-0420.2010.01440.x.
 9. Chen J, Xie J, Li H: A penalized likelihood approach for bivariate conditional normal models for dynamic co-expression analysis. Biometrics. 2011, 67 (1): 299-308. 10.1111/j.1541-0420.2010.01413.x.
 10. Wang L, Zheng W, Zhao H, Deng M: Statistical analysis reveals co-expression patterns of many pairs of genes in yeast are jointly regulated by interacting loci. PLoS Genet. 2013, 9 (3): 1003414. 10.1371/journal.pgen.1003414.
 11. Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A: Reverse engineering of regulatory networks in human B cells. Nat Genet. 2005, 37 (4): 382-90. 10.1038/ng1532.
 12. Bonneau R, Facciotti MT, Reiss DJ, Schmid AK, Pan M, Kaur A, et al: A predictive model for transcriptional control of physiology in a free living cell. Cell. 2007, 131 (7): 1354-65. 10.1016/j.cell.2007.10.053.
 13. Pereira-Leal JB, Enright AJ, Ouzounis CA: Detection of functional modules from protein interaction networks. Proteins Struct Function Bioinforma. 2004, 54 (1): 49-57. 10.1002/prot.10505.
 14. Witten DM, Tibshirani R, Hastie T: A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics. 2009, 10 (3): 515-34.
 15. Naylor MG, Lin X, Weiss ST, Raby BA, Lange C: Using canonical correlation analysis to discover genetic regulatory variants. PloS ONE. 2010, 5 (5): 10395. 10.1371/journal.pone.0010395.
 16. Lin D, Zhang J, Li J, Calhoun VD, Deng HW, Wang YP: Group sparse canonical correlation analysis for genomic data integration. BMC Bioinformatics. 2013, 14 (1): 245. 10.1186/1471-2105-14-245.
 17. Li Y, Nan B, Zhu J: Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure. Biometrics. 2015.
 18. Kim S, Xing EP: Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genet. 2009, 5 (8): 1000587. 10.1371/journal.pgen.1000587.
 19. Zhang L, Kim S: Learning gene networks under SNP perturbations using eQTL datasets. PLoS Comput Biol. 2014, 10 (2): 1003420. 10.1371/journal.pcbi.1003420.
 20. Casale FP, Rakitsch B, Lippert C, Stegle O: Efficient set tests for the genetic analysis of correlated traits. Nat Meth. 2015.
 21. Ideker T, Krogan NJ: Differential network biology. Mol Syst Biol. 2012, 8.
 22. Zhou S, Carraway KL, Eck MJ, Harrison SC, Feldman RA, Mohammadi M, et al: Catalytic specificity of protein-tyrosine kinases is critical for selective signalling. Nature. 1995, 373 (6514): 536-9. 10.1038/373536a0.
 23. Tibshirani R: Regression shrinkage and selection via the lasso. J R Stat Soc. Series B (Methodological). 1996, 58 (1): 267-88.
 24. Fan J, Li R: Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc. 2001, 96 (456): 1348-60. 10.1198/016214501753382273.
 25. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 1999, 27: 29-34. 10.1093/nar/27.1.29.
 26. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci of the USA. 2005, 102 (43): 15545-50. 10.1073/pnas.0506580102.
 27. Li J, Chen SX: Two sample tests for high-dimensional covariance matrices. Ann Stat. 2012, 40 (2): 908-40. 10.1214/12-AOS993.
 28. Jain S, Simon HU, Tomita E: Algorithmic Learning Theory. 16th International Conference, ALT 2005, Singapore, October 8–11, 2005, Proceedings. Springer-Verlag Berlin Heidelberg 2005.
 29. Cai T, Liu W, Xia Y: Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. J Am Stat Assoc. 2013, 108 (501): 265-77. 10.1080/01621459.2012.758041.
 30. Smith EN, Kruglyak L: Gene–environment interaction in yeast gene expression. PLoS Biol. 2008, 6 (4): 83. 10.1371/journal.pbio.0060083.
 31. Kurat CF, Wolinski H, Petschnigg J, Kaluarachchi S, Andrews B, Natter K, et al: Cdk1/Cdc28-dependent activation of the major triacylglycerol lipase Tgl4 in yeast links lipolysis to cell-cycle progression. Mol Cell. 2009, 33 (1): 53-63. 10.1016/j.molcel.2008.12.019.
 32. Ubersax JA, Woodbury EL, Quang PN, Paraz M, Blethrow JD, Shah K, et al: Targets of the cyclin-dependent kinase Cdk1. Nature. 2003, 425 (6960): 859-64. 10.1038/nature02062.
 33. Zhang DY, Dorsey MJ, Voth WP, Carson DJ, Zeng X, Stillman DJ, et al: Intramolecular interaction of yeast TFIIB in transcription control. Nucleic Acids Res. 2000, 28 (9): 1913-20. 10.1093/nar/28.9.1913.
 34. Venters BJ, Wachi S, Mavrich TN, Andersen BE, Jena P, Sinnamon AJ, et al: A comprehensive genomic binding map of gene and chromatin regulatory proteins in Saccharomyces. Mol Cell. 2011, 41 (4): 480-92. 10.1016/j.molcel.2011.01.015.
Copyright
This article is published under license to BioMed Central Ltd. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.