UPDATE MARCH 2011: The experimental data is very large and I have not yet managed to migrate it away from my old CWI account, so there are some broken links in this directory. I'm working on it; if you want the files, please contact me.

------

As discussed in the article, 110 networks were used for the experiment. These networks can be found in the directory 'inputNetworks'. They are provided as eNewick (.net) files.

To obtain the set of all triplets in a network, we first split a network with r reticulations into the 2^r possible trees that can be embedded into it. These trees are listed in the corresponding .tre file, which contains the 2^r trees in Newick format. To obtain the corresponding triplets we fed the .tre file into our NewickToTriplet program with the --noweight option specified. (We use --noweight because, given a network N, a triplet in T(N) should have weight 1.)

The results of the two experiments are in the directories 'missingDataExperiment' and 'noiseExperiment' respectively.

------------------------------------------------------------------------------------------------------------
*** Missing data experiment
-----------------------------------------------------------------------------------------------------------

The file 'summary.txt' summarises, for each of the 110 networks, the measurements discussed in the article.

'sample prob' = the probability that a triplet is included in the input
'% consistency' = network-triplet consistency (equal to network-network consistency in this case)
'NT-SD' = network-triplet symmetric difference
'NN-SD' = network-network symmetric difference
'NormNN-SD' = normalized (network-network) symmetric difference
'mu-diff' = mu-difference

The file "missing-consistency.eps" is a plot of average network-network triplet consistency for this experiment.
The file "missing-normNN-SD.eps" is a plot of average normalized symmetric difference for this experiment.
The file "missingdata-mu-table.txt" is a table with rows referring to the 96 considered networks (obtained by leaving out networks with size-4 galls) and the mu-difference under the epsilon parameter denoted by the column.

In each subfolder you will find a picture, for each of the 110 networks, of the network output by LEV1ATHAN with the missing data parameter set accordingly. For example, the folder HbR:0.6:--nopostprocess contains the results of the experiment when the inclusion probability was 0.6.

--------------------------------------------------------------------------------------------------------
*** Noise experiment
--------------------------------------------------------------------------------------------------------

'error prob' = the probability that a triplet flips to one of the other two possible triplets on the same three leaves
'% consistency' = network-triplet consistency
'% uncorrupted' = network-network triplet consistency
NN-SD, NormNN-SD, mu-diff = same as in the Missing Data experiment

The file "noise-consistency.eps" is a plot of average (network-network, network-triplet) triplet consistency for this experiment.
The file "noise-normNN-SD.pdf" is a plot of average normalized symmetric difference for this experiment.
The file "noise-mu-table.txt" is a table with rows referring to the 96 considered networks (obtained by leaving out networks with size-4 galls) and the mu-difference under the epsilon parameter denoted by the column.

The subfolders are named in the same way as in the Missing Data experiment.
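
The two perturbations described above ('sample prob' and 'error prob') can be illustrated with a minimal Python sketch. This is not the code used for the experiments. It assumes that each triplet is treated independently, that a corrupted triplet is replaced by one of the other two triplets on the same leaves with equal probability, and that a triplet xy|z is stored as a tuple (x, y, z). All function names are hypothetical.

```python
import random


def other_triplets(triplet):
    """Return the two other triplets on the same three leaves.

    A triplet xy|z is stored as (x, y, z), where z is the leaf split off first.
    (Representation assumed for this sketch only.)
    """
    x, y, z = triplet
    return [(x, z, y), (y, z, x)]


def sample_triplets(triplets, sample_prob, rng):
    """Missing data: keep each triplet independently with probability sample_prob."""
    return [t for t in triplets if rng.random() < sample_prob]


def corrupt_triplets(triplets, error_prob, rng):
    """Noise: with probability error_prob, replace a triplet by one of the other two.

    The uniform choice between the two alternatives is an assumption.
    """
    noisy = []
    for t in triplets:
        if rng.random() < error_prob:
            t = rng.choice(other_triplets(t))
        noisy.append(t)
    return noisy


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so that a run is repeatable
    full = [("a", "b", "c"), ("a", "b", "d"), ("c", "d", "a"), ("c", "d", "b")]
    print(sample_triplets(full, 0.6, rng))
    print(corrupt_triplets(full, 0.1, rng))
```

Passing an explicit random.Random instance keeps the two perturbations reproducible and independent of any global random state.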
----------------------------------------------------------------------------------------------------
*** Summarising both experiments
----------------------------------------------------------------------------------------------------

The file "all_plots.pdf" in this directory contains:

pages 1-4: the figures (already discussed above):
    missing-consistency.eps
    missing-normNN-SD.eps
    noise-consistency.eps
    noise-normNN-SD.pdf
pages 5-6: exemplary mu-distance plots for each experiment for a single network (Network number: 104)
pages 7-16: mu-distance plots for missing data experiment, 10 separate plots for 10 networks on each page (A-J)
pages 17-26: mu-distance plots for noise experiment, 10 separate plots for 10 networks on each page (A-J)

Finally, the data (averaging over all 96 networks) used to create the missing-consistency.eps, missing-normNN-SD.eps, noise-consistency.eps and noise-normNN-SD.pdf plots was obtained from the following two tables. (The tables were assembled by processing the two summary.txt files in the subdirectories.)

(i) MISSING:

sample prob   % consistency   NT-SD       NN-SD      NormNN-SD   mu-dist
1             100.000         0.000       0.000      0.000       0.000
0.9           98.377          10511.833   3926.042   0.052       58.958
0.8           98.381          17324.354   4135.813   0.053       63.896
0.7           98.336          24052.844   4288.385   0.056       69.042
0.6           98.324          30656.052   4307.396   0.058       76.063
0.5           98.558          37585.333   4546.188   0.057       82.292
0.4           98.386          44132.385   4598.104   0.060       89.125
0.3           98.428          50903.469   4774.604   0.061       94.896
0.2           98.348          57479.198   4908.260   0.065       106.625
0.1           98.126          64892.260   6365.115   0.091       119.229

(ii) NOISE:

error prob   % consistency   % uncorrupted   NN-SD      NormNN-SD   mu-dist
0            100.00          100.00          0.00       0.000       0.00
0.01         97.07           97.89           9535.97    0.123       73.19
0.02         96.17           97.83           9340.52    0.123       85.17
0.03         95.40           97.86           10252.49   0.129       89.44
0.04         94.57           97.85           10198.16   0.132       93.56
0.05         93.69           97.82           10630.07   0.137       94.25
0.06         92.99           97.91           10655.10   0.135       100.40
0.07         92.05           97.79           11127.09   0.141       99.46
0.08         91.31           97.86           11563.63   0.143       101.71
0.09         90.39           97.77           11502.21   0.146       101.60
0.10         89.62           97.82           11765.10   0.147       104.83
0.15         85.74           97.89           13406.94   0.159       109.81
0.20         81.63           97.60           14606.25   0.177       114.15
0.25         77.78           97.50           15868.45   0.191       118.60
0.30         73.89           96.67           18961.92   0.226       125.13
0.35         69.41           94.35           23652.57   0.273       131.15
0.40         61.73           87.30           29600.10   0.332       142.56
0.45         55.23           83.01           29582.90   0.347       150.60
0.50         50.88           80.68           30236.22   0.361       155.73
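
As a supplementary illustration of the triplet extraction step described at the top of this file (split the network into trees, then collect the triplets of every tree without weights), here is a minimal Python sketch. It is not the NewickToTriplet program. It assumes the trees have already been parsed from Newick into nested tuples with string leaf names, and the function names are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Leaves of a rooted tree given as nested tuples, e.g. (("a", "b"), "c")."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def triplets(tree):
    """Set of triplets xy|z of a tree, each stored as (frozenset({x, y}), z)."""
    if isinstance(tree, str):
        return set()
    found = set()
    child_leaves = [leaves(child) for child in tree]
    # x and y below one child of this node, z below a different child:
    # lca(x, y) then lies strictly below lca(x, z), so xy|z is a triplet.
    for i, inside in enumerate(child_leaves):
        outside = [leaf
                   for j, other in enumerate(child_leaves) if j != i
                   for leaf in other]
        for x, y in combinations(inside, 2):
            for z in outside:
                found.add((frozenset((x, y)), z))
    # Triples lying entirely below one child are handled recursively.
    for child in tree:
        found |= triplets(child)
    return found


def network_triplets(trees):
    """Unweighted union of the triplets of all trees obtained from one network."""
    result = set()
    for tree in trees:
        result |= triplets(tree)
    return result


if __name__ == "__main__":
    # Two made-up trees standing in for the contents of a .tre file.
    tree_1 = (("a", "b"), ("c", "d"))
    tree_2 = ((("a", "c"), "b"), "d")
    for cherry, z in sorted(network_triplets([tree_1, tree_2]),
                            key=lambda t: (sorted(t[0]), t[1])):
        print("".join(sorted(cherry)) + "|" + z)
```

Collecting the triplets in a set means that a triplet appearing in several of the trees is counted once, which corresponds to the unweighted setting described for --noweight.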