Binding Intensity Only Tile array analysis, or "BioTile", is an algorithm written in Perl for identifying differentially enriched regions (DERs) in tiling array data.

BioTile requires both the user's data file and the Annotation file to be correctly formatted in order to run. Follow the steps below to use BioTile:

Data should be tab delimited and adhere to the following column order:

```
Data Columns:
1.) chromosome
2.) chromosomal coordinate*
3.) ID column
4 and on.) data
Note: * chromosomal coordinates must be in ascending order per chromosome but chromosomes need not be in order
```

```
CHR START UniqueID case1 case2 case3 case4 case5 con1 con2....
chr1 3260443 chr1-3260443 5.756197581 6.073657606 5.072443511 4.566772967 4.887691227 5.361445566 4.908736039
chr1 3260484 chr1-3260484 10.37899959 2.812171903 7.484823045 3.51284835 2.914004916 9.009255134 4.744176783
chr1 3260518 chr1-3260518 8.330929208 5.948986357 7.055280556 7.780053389 8.27809947 7.947379971 7.709037061
chr1 3260549 chr1-3260549 5.321461034 5.410797637 4.457120359 5.090445555 5.276087083 3.718002551 4.473440208
chr1 3260581 chr1-3260581 9.844482722 9.524443421 10.08292572 9.659773611 10.43945409 9.989464018 9.223844286
```
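
Before running BioTile, it can help to confirm that a data file follows the layout above. The following is a minimal Python sketch, not part of BioTile itself. The function name `check_data_file` is hypothetical, the values in the usage lines are made up for illustration, and the sketch assumes the first line is a header row as in the example. Repeated coordinates are not flagged, since the format description only requires ascending order.

```python
import io

def check_data_file(handle):
    """Return a list of formatting problems found in a tab-delimited data file."""
    problems = []
    last_pos = {}   # chromosome -> most recent coordinate seen
    n_cols = None
    for line_no, line in enumerate(handle, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if line_no == 1:
            # Assumption: the first line is a header row, as in the example.
            n_cols = len(fields)
            if n_cols < 4:
                return ["header needs chromosome, coordinate, ID and at least one data column"]
            continue
        if len(fields) != n_cols:
            problems.append(f"line {line_no}: expected {n_cols} columns, found {len(fields)}")
            continue
        chrom = fields[0]
        try:
            pos = int(fields[1])
        except ValueError:
            problems.append(f"line {line_no}: coordinate {fields[1]!r} is not an integer")
            continue
        # Coordinates must ascend within a chromosome; chromosome order is free.
        if chrom in last_pos and pos < last_pos[chrom]:
            problems.append(f"line {line_no}: {chrom} coordinate {pos} is out of order")
        last_pos[chrom] = pos
    return problems

# Illustrative input with two rows deliberately out of order.
sample = io.StringIO(
    "CHR\tSTART\tUniqueID\tcase1\tcon1\n"
    "chr1\t3260484\tchr1-3260484\t10.4\t9.0\n"
    "chr1\t3260443\tchr1-3260443\t5.8\t5.4\n"
)
for problem in check_data_file(sample):
    print(problem)
```

To check a real file, pass an open file handle instead of the `io.StringIO` object.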

The structure of the Annotation file must be retained, and the order and diagnosis (diag) variables must match the number of data columns in the data file. Fill in the fields as follows:

```
1.) Datafile: Enter the name of your data file. The data file should be in the same working directory as BioTile.pl.
2.) Outfile: Enter the name of the output file where your results will be saved.
3.) Minimum Probes/DMR: Enter the minimum number of probes required to call a DMR. The minimum is 3, which is recommended for finding small
differences.
4.) Iterations for P value: Enter the number of iterations desired for P value generation. The default and recommended number is 1000.
Changing this value strongly affects processing time.
5.) Spacing between probes (bp): Enter the spacing between probes in base pairs on the microarray platform used to generate the data.
This value ensures that no large gaps are included in identified DMRs.
6.) Adjust for Covariates: Specify y/n to adjust for covariates prior to DMR identification and statistical testing.
7.) Independent Variable Continuous: Specify y/n to indicate whether the independent variable in the "diag" column is continuous and should be tested with
a linear model. Specifying "y" with binary classifiers will return the same results as "n" but may increase processing time.
8.) Specify ID, independent variable (diag), order, and any optional covariates in tab-delimited table format below, as in the example:
```

```
##Please Enter Specifics Below
Datafile= Example_Data_Set.txt
Outfile= Analysis_Output.txt
Minimum Probes/DMR= 3
Iterations for P value= 1000
Spacing between probes (bp)= 35
Adjust for Covariates= n
Independent Variable Continuous= n
ID diag order Covariates 1 Covariates2 Covariates3 Covariates4
1 1 1 1
2 1 2 1
3 1 3 2
4 1 4 2
5 1 5 1
6 2 6 2
7 2 7 1
8 2 8 2
9 2 9 2
10 2 10 1
```
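
The requirement that the sample table match the data columns can be checked before a run. The sketch below is a hypothetical Python helper, not BioTile's own parser. It assumes that settings appear as `Key= value` lines, that the first line without `=` is the tab-delimited table header, and that data columns start at the fourth column of the data file. The function names and the shortened four-sample input are illustrative only.

```python
def parse_annotation(text):
    """Split an annotation file into settings, table header, and sample rows."""
    settings, header, rows = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("##"):
            continue                      # skip blank lines and the banner line
        if header is None and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
        elif header is None:
            header = line.split("\t")     # first non-setting line is the table header
        else:
            rows.append(line.split("\t"))
    return settings, header, rows

def sample_count_matches(rows, data_header):
    """True if there is one annotation row per data column (columns 4 and on)."""
    return len(rows) == len(data_header) - 3

# Shortened, illustrative annotation text with four samples.
annotation = (
    "##Please Enter Specifics Below\n"
    "Datafile= Example_Data_Set.txt\n"
    "Minimum Probes/DMR= 3\n"
    "ID\tdiag\torder\n"
    "1\t1\t1\n"
    "2\t1\t2\n"
    "3\t2\t3\n"
    "4\t2\t4\n"
)
settings, header, rows = parse_annotation(annotation)
data_header = ["CHR", "START", "UniqueID", "case1", "case2", "con1", "con2"]
print(settings["Datafile"], sample_count_matches(rows, data_header))
```

A `False` result here would suggest that a sample row is missing from the table or that the data file has an extra column.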

From the command line interface, call: perl BioTile.pl

Note: The example dataset provided contains the top 5000 loci and values for 5 randomly selected cases and controls from the simulated dataset analyzed in the published paper.

The output file contains the following columns:

```
1.) The chromosome of the identified DMR
2.) The start position of the identified DMR
3.) The end position of the identified DMR
4.) The number of probes in the identified DMR
5.) The mean effect size of the identified DMR *
6.) The maximum effect size within the identified DMR
7.) The position of the probe with the maximum effect size
8.) The p value obtained from permutation testing
9.) The Q statistic of the meta-analysis for the identified DMR. If multiple P values are equal to 0,
those with higher Q statistics most likely represent larger differences over longer DMRs.
*Note: The mean effect size represents the value for the independent variable coded as 1 minus that coded as 2.
If continuous was specified for the independent variable, the mean effect represents the mean slope
of the linear models across probes at the identified DMR.
```
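
To make the sign convention of the mean effect concrete, here is a small Python sketch for the binary case. It is one plausible reading of the note above, not BioTile's actual code: it assumes the effect at each probe is the mean of the samples coded 1 minus the mean of the samples coded 2, and that the mean effect averages those per-probe differences. The function name and all values are hypothetical.

```python
from statistics import mean

def mean_effect(probe_values, diag):
    """Return (mean effect, per-probe effects) for a region.

    probe_values: one list of sample values per probe in the region.
    diag: group code (1 or 2) for each sample, in the same column order.
    """
    effects = []
    for values in probe_values:
        group1 = [v for v, d in zip(values, diag) if d == 1]
        group2 = [v for v, d in zip(values, diag) if d == 2]
        # Assumed convention: group coded 1 minus group coded 2.
        effects.append(mean(group1) - mean(group2))
    return mean(effects), effects

# Two probes, four samples (two coded 1, two coded 2); values are made up.
region = [[2.0, 4.0, 1.0, 1.0],
          [3.0, 3.0, 2.0, 2.0]]
overall, per_probe = mean_effect(region, [1, 1, 2, 2])
print(overall, per_probe)
```

Under this reading, a positive mean effect means higher values in the group coded 1.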

```
Chr Start End #Probes Mean Effect (1-2) Max Effect Max Pos P value Q statistic
chr1 3260655 3260722 3 0.206738228866666 0.3090028726 3260655 0.889110889110889 0.307765637851882
chr1 3658926 3658992 3 0.115758363 0.1658440818 3658926 0.942057942057942 0.116155659174058
chr1 3662743 3662820 3 0.387891732133333 0.8852050682 3662820 0.891108891108891 0.273096675378496
chr1 3662968 3663088 4 0.38327091275 0.9043733208 3663047 0.137862137862138 6.14582222894925
chr1 3665310 3665418 4 0.50777318565 1.132457159 3665382 0.378621378621379 3.39035486298906
```