Results of the Data Randomization
| R | Number of genes from CGAP libraries with log likelihood at least R | Mean number of genes from randomized libraries with log likelihood at least R | Believability |
| 13 | 3 | 0.003 | 99.9% |
| 12 | 4 | 0.005 | 99.9% |
| 11 | 5 | 0.009 | 99.8% |
| 10 | 10 | 0.03 | 99.7% |
| 9 | 14 | 0.1 | 99.0% |
| 8 | 21 | 0.4 | 98.2% |
| 7 | 36 | 1.1 | 97.0% |
| 6 | 74 | 6.3 | 91.5% |
| 5 | 120 | 16 | 86.3% |
| 4 | 275 | 49 | 82.2% |
| 3 | 997 | 421 | 57.8% |
| 2 | 1840 | 1347 | 26.8% |
| 1 | 9947 | 5294 | 46.8% |
Note. This table shows the results of the randomization procedure used to test the believability of the genes at a given log likelihood ratio. The first column gives the threshold R. The second column gives the number of genes from the CGAP data set with log likelihood at least R. The third column gives the same count, averaged over 1000 runs of randomized data. The final column is a heuristic measure of believability: one minus the ratio of the number of genes from the randomized data to the number of genes from the CGAP data with log likelihood at least R. This heuristic is valid only when the number of genes from the real data set is much greater than the number of genes from the randomized data. The 21 genes with log likelihood ratio at least 8 are listed in Table 2.
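The believability heuristic described in the note can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the authors' code: the function name `believability` and the choice of rows are assumptions made here, and the input numbers are copied from the table above.

```python
def believability(n_real: float, mean_randomized: float) -> float:
    """Heuristic from the table note: one minus the ratio of the mean
    gene count in randomized data to the gene count in the real data,
    both counted at the same threshold R.
    (Function name is hypothetical, chosen for this sketch.)"""
    if n_real <= 0:
        raise ValueError("n_real must be positive")
    return 1.0 - mean_randomized / n_real


# (R, genes from CGAP data, mean genes from randomized data),
# copied from four rows of the table
rows = [(13, 3, 0.003), (10, 10, 0.03), (6, 74, 6.3), (3, 997, 421)]

for r, n_real, n_rand in rows:
    print(f"R >= {r}: {believability(n_real, n_rand):.1%}")
```

For these four rows the sketch reproduces the tabulated values (99.9%, 99.7%, 91.5%, 57.8%). A few other rows differ in the last digit when recomputed this way, most likely because the means in the third column are shown rounded.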











