Input

(The MQLS_input file is a modified version of the input.txt file from the CC-QLS software package.)

1. marker data file (e.g. 'markid')

This file contains the marker data and binary phenotype information. It has the standard linkage format :

    1    1    7    6    1    1    1    2    1    2
    1    2    7    6    2    2    1    1    1    2
    1    3    7    6    1    2    3    1    1    2
    1    7    0    0    1    1    1    1    1    1
    1    6    0    0    2    0    2    3    2    2
    2    1    8    9    2    2    1    3    1    1
    2    2    8    9    1    1    2    3    1    1
    2    8    0    0    1    0    1    2    1    2
    2    9    5    6    2    1    3    1    1    1
    2    5    0    0    1    0    3    2    1    2
    2    6    0    0    2    0    1    1    1    2
   (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9)  (10) ....

(1) family ID
(2) individual ID
(3) father's ID (0=unknown)
(4) mother's ID (0=unknown)
(5) sex (1=male, 2=female)
(6) affection status (0=unknown, 1=unaffected, 2=affected)
(7-8), (9-10)... marker genotype (0 for missing alleles)

Families should be numbered from 1 to F, without gaps. There is no limit on the number of individuals, but the number of families must be smaller than 500. To increase this limit, change the value of MAXFAM in the MQLStest.c source file and recompile the program.

Each individual should be entered only once. The number of columns should be the same for every individual: use 0 for missing information (missing genotypes or unknown phenotype).

The number of markers to be analyzed is determined from the number of columns on the first line of this file. The number of markers is limited to 500. To increase the limit, change the value of MAXMARK in the MQLS.c source file and recompile the program.

Alleles at each marker locus should be numbered from 1 to M without gaps. There is no limit on the number of alleles at each marker locus.

All the individuals who have either (1) known affection status or (2) at least some genotype information should also be listed in the kinship coefficient file (see below).

2. The kinship coefficient file (e.g. 'kinshipcoef')

This file contains the kinship and inbreeding coefficients for all possible pairs of individuals, within each family, who have either (1) known affection status or (2) non-missing genotype for at least one marker. (E.g. an individual with unknown phenotype should still be included if he or she has any non-missing genotype information.)

Entries equal to 0 must still be listed: if an individual's inbreeding coefficient is 0, or if the kinship coefficient of a pair of individuals within a family is 0, the corresponding line must be in the file or the individuals will not be included in the analysis. All pairs should be considered, regardless of phenotype (affected/affected, affected/unaffected, affected/unknown, unaffected/unaffected, unaffected/unknown, unknown/unknown).

It has the following format :

    1    1    1    0.0
    1    1    2    0.25
    1    1    3    0.25
    1    1    7    0.25
    1    1    6    0.25
    1    2    2    0.0
    1    2    3    0.25
    1    2    7    0.25
    1    2    6    0.25
    1    3    3    0.0
    1    3    7    0.25
    1    3    6    0.25
    1    7    7    0
    1    7    6    0
    1    6    6    0
    1    3    3    1
    2    1    1    0.01251
    2    1    2    0.26124
    .    .    .    .
    .    .    .    .
   (1)  (2)  (3)  (4)

(1) family ID
(2) individual 1 ID (Id1)
(3) individual 2 ID (Id2)
(4) kinship coefficient between Id1 and Id2 if Id1 is different from Id2
    inbreeding coefficient of Id1 if Id1 equals Id2

The family ID and individual ID should match exactly with the IDs in the marker data file.

The program runs faster when the coefficients are ordered in the following way : in each family, the order of the pairs follows the order of the individuals given in the marker data file. Considering a family numbered 4 with 3 individuals 47, 48 and 49 listed in this order in the marker data file, the order in the kinship coefficient file would be (H denotes an inbreeding coefficient, phi a kinship coefficient) :

    4 47 47 H_47
    4 47 48 phi_(47,48)
    4 47 49 phi_(47,49)
    4 48 48 H_48
    4 48 49 phi_(48,49)
    4 49 49 H_49

Two software programs that can be used to obtain kinship and inbreeding coefficients are

(1) The KinInbcoef software. The output file of the KinInbcoef program has the exact format required for the MQLS software.
The KinInbcoef program can be found at
http://www.stat.uchicago.edu/~mcpeek/software/KinInbcoef/index.html

and (2) The idcoefs 2.0 software, which can be found at
http://home.uchicago.edu/~abney/abney_web/Software.html

The idcoefs 2.0 software computes identity coefficients for pairs of individuals. Kinship and inbreeding coefficients can then be computed from the identity coefficients (the output from this software).

3. The prevalence file (e.g. 'prevalence')

This file contains the estimate of the prevalence of the binary trait in the general population. This prevalence value is used in the calculation of the MQLS statistic.

4. The MQLS software gives the user TWO OPTIONS for how to handle the individuals of unknown phenotype in the analysis.

The user specifies the option, either 1 or 2, at the command line when running the executable program 'MQLStest'. See the 'MQLS_README' file for how to run the executable program with the two options. Below are details of the two options:

OPTION 1: This should be considered the default for the MQLS test. Under this option, the MQLS test is performed with 3 different phenotype categories allowed: affected, unaffected, and unknown. This option is useful in that it allows individuals who are genotyped but not phenotyped to be included in the analysis. Furthermore, phenotyped individuals with missing genotype data are also allowed to contribute to the MQLS test with this option (if they have genotyped relatives in the sample). The WQLS and corrected chi-squared statistics are computed with the cases taken to be the affecteds and the controls taken to be the unknown and unaffected individuals combined. They do not make use of individuals with missing genotype data at the tested marker.

OPTION 2: This option is provided for backward compatibility with the CC-QLS software for calculating WQLS and corrected chi-squared.
Under this option, individuals with unknown phenotype are excluded from all tests, and individuals with missing genotype data at a given marker are excluded from the test at that marker. If this option is run, results for WQLS and corrected chi-squared will be consistent with the output of the CC-QLS software (provided that there are no MZ twin pairs in the sample; MZ twins are allowed when using the MQLS software but not in the CC-QLS software). Under option 2, the MQLS test will also be performed with these individuals removed from the analysis, which could reduce its power.
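
The format rules given in sections 1 and 2 can be checked before running 'MQLStest'. The Python script below is a minimal sketch and is not part of the MQLS package: the function names (parse_rows, check_marker_rows, expected_kinship_order) are hypothetical, and the constants MAXFAM and MAXMARK simply mirror the default limits quoted in section 1. It checks the column counts, duplicate individuals, family numbering and allele numbering of the marker data, and lists the pairs in the order recommended for the kinship coefficient file.

```python
from itertools import combinations_with_replacement

# Default limits as quoted in section 1 (assumed unchanged in the C source).
MAXFAM = 500
MAXMARK = 500


def parse_rows(lines):
    """Split the non-empty lines of a marker data file into rows of integers."""
    return [[int(tok) for tok in line.split()] for line in lines if line.strip()]


def check_marker_rows(rows):
    """Return a list of problems found; an empty list means none were found."""
    if not rows:
        return ["marker data file is empty"]
    problems = []
    ncols = len(rows[0])  # the first line fixes the number of columns
    if ncols < 8 or (ncols - 6) % 2:
        return ["first line must have 6 columns plus 2 columns per marker"]
    nmark = (ncols - 6) // 2
    if nmark > MAXMARK:
        problems.append(f"{nmark} markers exceed the limit of {MAXMARK}")

    seen = set()
    for n, row in enumerate(rows, start=1):
        if len(row) != ncols:
            problems.append(f"line {n}: {len(row)} columns, expected {ncols}")
            continue
        key = (row[0], row[1])  # (family ID, individual ID)
        if key in seen:
            problems.append(f"line {n}: individual {row[1]} of family {row[0]} entered twice")
        seen.add(key)

    families = sorted({fam for fam, _ in seen})
    if families != list(range(1, len(families) + 1)):
        problems.append("families are not numbered from 1 to F without gaps")
    if len(families) >= MAXFAM:
        problems.append(f"number of families must be smaller than {MAXFAM}")

    for m in range(nmark):
        # Non-missing alleles observed at marker m (0 codes a missing allele).
        alleles = {a for r in rows if len(r) == ncols
                   for a in r[6 + 2 * m: 8 + 2 * m] if a != 0}
        if alleles != set(range(1, len(alleles) + 1)):
            problems.append(f"marker {m + 1}: alleles are not numbered from 1 to M without gaps")
    return problems


def expected_kinship_order(rows):
    """List (family, Id1, Id2) in the pair order recommended in section 2."""
    by_family = {}
    for row in rows:
        # Keep individuals with known affection status or any non-missing allele.
        if row[5] != 0 or any(row[6:]):
            by_family.setdefault(row[0], []).append(row[1])
    return [(fam, id1, id2)
            for fam, ids in by_family.items()
            for id1, id2 in combinations_with_replacement(ids, 2)]


if __name__ == "__main__":
    # Family 1 from the marker data example in section 1.
    sample = [
        "1 1 7 6 1 1 1 2 1 2",
        "1 2 7 6 2 2 1 1 1 2",
        "1 3 7 6 1 2 3 1 1 2",
        "1 7 0 0 1 1 1 1 1 1",
        "1 6 0 0 2 0 2 3 2 2",
    ]
    rows = parse_rows(sample)
    print(check_marker_rows(rows))
    for fam, id1, id2 in expected_kinship_order(rows):
        print(fam, id1, id2)
```

To check a real file, read its lines (for example with open('markid').readlines()) and pass them to parse_rows. The script only inspects the marker data file; it does not verify that the kinship coefficient file actually contains every listed pair.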