GAMETES Data Simulation Software

User Guide

Model Generation can produce a specific two-locus model (heritability 0.2, MAF 0.2).
The configurable parameters are: number of attributes, heritability, MAF, population prevalence, the name of the SNP, the model difficulty metric used to identify quantiles, the quantile count, and the quantile population size.

Number of attributes
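The number of SNP loci in the model.
Heritability
The proportion of the observable differences between individuals that is due to genetic differences. If the heritability value is too high, GAMETES cannot generate random models.
Population prevalence (K)
The proportion of the population that has the disease, between 0 and 1. Leaving K at its default value is recommended.
MAF
Minor allele frequency, from 0.05 to 0.5.
Quantiles
Choose either EDM or Odds Ratio.
Quantile Count
The number of model architectures saved to the model output file.
Quantile Population Size
The number of random model architectures you want GAMETES to create under the given model constraints.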
Quantile Count <= Quantile Population Size
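
The ranges listed above can be checked before running a simulation. The following is only a sketch: `check_gametes_params` and its argument names are hypothetical helpers written for these notes, not part of GAMETES, and the bounds are simply the ones stated above.

```python
def check_gametes_params(maf, prevalence, quantile_count, quantile_population_size):
    """Return a list of problems; an empty list means the values are consistent
    with the ranges given in these notes (hypothetical helper, not a GAMETES API)."""
    problems = []
    if not 0.05 <= maf <= 0.5:
        problems.append("MAF should be between 0.05 and 0.5")
    if not 0.0 <= prevalence <= 1.0:
        problems.append("prevalence K should be between 0 and 1")
    if quantile_count > quantile_population_size:
        problems.append("Quantile Count must be <= Quantile Population Size")
    return problems


# A consistent setting produces no problems.
print(check_gametes_params(0.2, 0.3, 2, 1000))
# An out-of-range MAF and an oversized quantile count are both reported.
print(check_gametes_params(0.6, 0.3, 5, 2))
```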

Create Models
Quantile Count = 1
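Three parameters can be modified: the number of attributes, penetrance (the probability that an individual with a particular genotype combination has the disease), and MAF.
Marginal penetrances
The probability of disease for each genotype when only a single SNP locus is considered.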

Penetrance values lie between 0 and 1.
the penetrance values of the penetrance function
Attributes themselves are labeled by default as P1, P2, or P3 for each ‘predictive’ SNP in the model.
Models where all penetrance values are either 0 or 1
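In general, real penetrance tables contain intermediate values, but when generating data it is hard to tune the parameters (prevalence) so that the marginal penetrances come out equal, so we usually set the penetrance values to 0 and 1.
Each marginal penetrance is computed as follows. Fix SNP1 at AA and look only at the first row of the table (the SNP2 genotypes). Multiply the penetrance values of AABB, AABb, and AAbb (0.266, 0.764, 0.664) by the corresponding SNP2 genotype frequencies and add them up: 0.25×0.266 + 0.5×0.764 + 0.25×0.664 = 0.6145 ≈ 0.614.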
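
The calculation above can be reproduced in a few lines of Python. This is a sketch under one assumption: the genotype frequencies 0.25, 0.5, 0.25 are taken to come from Hardy-Weinberg proportions with MAF 0.5, which the notes do not state explicitly. The function names are illustrative only.

```python
def genotype_frequencies(maf):
    """Hardy-Weinberg frequencies for (BB, Bb, bb), where b is the minor allele.
    Assumption: Hardy-Weinberg proportions; not stated in the notes."""
    major = 1.0 - maf
    return (major * major, 2.0 * major * maf, maf * maf)


def marginal_penetrance(row, maf):
    """Weight one row of the penetrance table by the other SNP's genotype frequencies."""
    freqs = genotype_frequencies(maf)
    return sum(f * p for f, p in zip(freqs, row))


# First row of the table: SNP1 fixed at AA, SNP2 in (BB, Bb, bb).
value = marginal_penetrance((0.266, 0.764, 0.664), maf=0.5)
print(round(value, 4))
```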

Datasets simulated using GAMETES have two types of attributes/SNPs: (1) predictive attributes and (2) non-predictive attributes. Predictive attributes are those specified in the genetic model. Non-predictive attributes are all other attributes, which have no specified association with affection status (i.e. case or control).
E.g., if you have a 3-locus model and you specify 100 total attributes, 97 will be non-predictive.
Notice that by default, the datasets are saved such that the first columns include non-predictive attributes, followed by predictive attributes, and lastly the class status. Non-predictive attributes are labeled in the dataset with the prefix ‘N’ (e.g. N38). Predictive attribute labels begin with a simple model identifier (useful when modeling heterogeneity), with the prefix ‘M’ for model, and then the prefix ‘P’ for predictive attribute (e.g. M0P4). (To generate a heterogeneous, mixed model, the heterogeneity parameters must sum to 1.)
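
When loading such a dataset, the labeling convention described above makes it easy to separate the two attribute types. The sketch below assumes labels follow exactly the patterns `N<number>` and `M<number>P<number>`; the header list, including the class column name `Class`, is a made-up example rather than real GAMETES output.

```python
import re


def classify_attribute(label):
    """Classify a column label using the prefixes described in the notes."""
    if re.fullmatch(r"N\d+", label):
        return "non-predictive"
    if re.fullmatch(r"M\d+P\d+", label):
        return "predictive"
    return "other"  # e.g. the class-status column


# Hypothetical header: non-predictive columns first, then predictive, then class.
header = ["N0", "N1", "N38", "M0P0", "M0P4", "Class"]
for label in header:
    print(label, classify_attribute(label))
```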

Discovering causal interactions using Bayesian network scoring and information gain
When variables combine to affect a target with no marginal effect (e.g. pure, strict epistasis), we definitely can say there is an interaction.

However, in general, there does not appear to be a dichotomous way to classify a discrete causal relationship as an interaction or a non-interaction. So, we propose a fuzzy set membership definition of a discrete interaction in the Methods Section.

GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures
We classify models of statistical epistasis into pure and impure, as well as strict and nested subtypes.
Pure refers to epistasis between n loci that do not display any main effects.
Strict refers to epistasis where n loci are predictive of phenotype but no proper multi-locus subset of them are.
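
For a two-locus model, "pure" can be checked numerically: every single-locus marginal penetrance must be the same, so no SNP shows a main effect on its own. The sketch below assumes Hardy-Weinberg genotype frequencies and a 3×3 penetrance table indexed by the minor-allele counts of SNP1 (rows) and SNP2 (columns). The function names and both example tables are illustrative, not taken from the GAMETES paper.

```python
def hwe(maf):
    """Hardy-Weinberg genotype frequencies for 0, 1, 2 copies of the minor allele."""
    q = maf
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def marginals(table, maf1, maf2):
    """Marginal penetrances of SNP1 (per row) and SNP2 (per column)."""
    f1, f2 = hwe(maf1), hwe(maf2)
    rows = [sum(f2[j] * table[i][j] for j in range(3)) for i in range(3)]
    cols = [sum(f1[i] * table[i][j] for i in range(3)) for j in range(3)]
    return rows, cols


def is_pure(table, maf1, maf2, tol=1e-9):
    """True when all marginal penetrances coincide, i.e. no main effects."""
    rows, cols = marginals(table, maf1, maf2)
    values = rows + cols
    return max(values) - min(values) <= tol


xor_like = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]    # no main effects at MAF 0.5
double_rec = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]  # each SNP has a main effect
print(is_pure(xor_like, 0.5, 0.5), is_pure(double_rec, 0.5, 0.5))  # True False
```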

Meaning of 0, 1, 2
The essential idea is to code binary SNPs not as integers, but as bits,
or as the number of copies of the second allele at each locus.
0 represents the homozygous common genotype, 1 represents the heterozygous genotype, and 2 represents the homozygous minor genotype.
A denotes the common allele and a denotes the rare allele.
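
The 0/1/2 coding is simply a count of minor alleles. A minimal sketch, assuming genotypes are written as two-letter strings with `a` as the minor allele (the function name is illustrative):

```python
def encode_genotype(genotype, minor_allele="a"):
    """Return the number of minor alleles in a two-letter genotype string."""
    if len(genotype) != 2:
        raise ValueError("expected a two-letter genotype such as 'Aa'")
    return sum(1 for allele in genotype if allele == minor_allele)


print([encode_genotype(g) for g in ("AA", "Aa", "aA", "aa")])  # [0, 1, 1, 2]
```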

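Research on genome-wide SNP interactions, Shang Junliang (尚军亮)
INME (SNP interaction displaying no marginal effects) means that no single SNP affects the phenotype on its own, but two or more SNPs combined show a strong joint effect.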


In the example model, K (prevalence) is 0.614, the marginal penetrance is 0.614, the heritability is 0.5, and the marginal effect is 0.

Heritability generally cannot equal 1, prevalence cannot equal 0.25,
and the two MAFs cannot both be 0.25.
Heritability can equal 1 only when the MAF is 0.5 or 0.29.
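
To see how a heritability of 1 can arise at MAF 0.5, prevalence and heritability can be computed directly from a penetrance table. The sketch below assumes Hardy-Weinberg genotype frequencies and the commonly used definition h² = Σ f_ij (p_ij − K)² / (K(1 − K)), where f_ij is the two-locus genotype frequency and p_ij its penetrance. The table is an illustrative 0/1 model, not one taken from these notes.

```python
def hwe(maf):
    """Hardy-Weinberg genotype frequencies for 0, 1, 2 copies of the minor allele."""
    q = maf
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def prevalence_and_heritability(table, maf1, maf2):
    """Return (K, h^2) for a 3x3 penetrance table (assumed definition of h^2)."""
    f1, f2 = hwe(maf1), hwe(maf2)
    cells = [(f1[i] * f2[j], table[i][j]) for i in range(3) for j in range(3)]
    k = sum(f * pen for f, pen in cells)                 # population prevalence
    variance = sum(f * (pen - k) ** 2 for f, pen in cells)
    return k, variance / (k * (1.0 - k))


# Illustrative model in which every penetrance value is 0 or 1.
xor_like = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(prevalence_and_heritability(xor_like, 0.5, 0.5))  # (0.5, 1.0)
```

With intermediate penetrance values the variance term shrinks, so the same function returns a heritability below 1.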

LD

Difference between IME and INME

How the eME model is computed (from the EpiSIM manual)

When AA and Aa carry the same disease probability, the marginal effect size MES = 0, so the locus has no main effect: only aa leads to disease. This is the recessive disease model.

Ye Daojun (叶道军)
Disease models can be divided into three types.
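The recessive disease model is a model with no main effect (no marginal effect): AA and Aa have the same disease rate and only aa leads to disease, so MES = 0. (Put simply, AA and Aa must have the same disease rate; this example is just a special case.) It occurs mainly in simple diseases, which have already been studied fairly thoroughly, so studying models that do have marginal effects is well worthwhile.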


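In Model 1, each single SNP affects the disease, and the effects are equal: judging from the first row and the first column, the two SNPs have the same MES.
Model 1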

In Model 2 and Model 3, no single SNP has a marginal effect: the first row and the first column show that each SNP's MES is 0.
In the additive model the MES values are equal, but they are not necessarily 0.

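About MES
For the marginal effect size of a SNP locus: if the disease risk is high for Aa and low for AA, the MES of that locus is large, meaning the locus has a strong main effect on the disease; otherwise the main effect is small.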
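
One way to make this concrete is to treat MES as the relative risk of Aa versus AA, minus 1. This exact formula is an assumption chosen to agree with the notes (equal risks give MES = 0, a larger Aa risk gives a larger MES); the definition in the EpiSIM manual should be checked before relying on it. The function name and the penetrance values are illustrative.

```python
def marginal_effect_size(pen_AA, pen_Aa):
    """Assumed definition: MES = P(disease | Aa) / P(disease | AA) - 1."""
    if pen_AA <= 0:
        raise ValueError("penetrance of AA must be positive")
    return pen_Aa / pen_AA - 1.0


# Equal risk for AA and Aa: no main effect under this definition.
print(marginal_effect_size(0.1, 0.1))  # 0.0
# Higher risk for Aa than for AA: a positive marginal effect size.
print(marginal_effect_size(0.1, 0.15))
```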