Model: Random forest

Accuracy

The random forest reached an accuracy of 0.754 on the validation data and 0.750 on the test data.

Accuracy per category

[Figures: four plots of accuracy per category]

| Class | Precision | Number of cell images | Correct calls | Recall | Accuracy |
|---|---|---|---|---|---|
| cell periphery | 0.842 | 1569 | 1158 | 0.738 | 0.950 |
| cytoplasm | 0.735 | 1276 | 1080 | 0.846 | 0.953 |
| endosome | 0.539 | 689 | 326 | 0.473 | 0.949 |
| er | 0.808 | 1755 | 1315 | 0.749 | 0.940 |
| golgi | 0.695 | 382 | 291 | 0.762 | 0.982 |
| mitochondrion | 0.663 | 1243 | 967 | 0.778 | 0.939 |
| nuclear periphery | 0.781 | 1164 | 964 | 0.828 | 0.962 |
| nucleolus | 0.893 | 1263 | 1066 | 0.844 | 0.974 |
| nucleus | 0.862 | 1627 | 1356 | 0.833 | 0.961 |
| peroxisome | 0.442 | 164 | 61 | 0.372 | 0.986 |
| spindle pole | 0.585 | 781 | 439 | 0.562 | 0.948 |
| vacuole | 0.537 | 587 | 352 | 0.600 | 0.957 |

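Across classes, the mean and median were 0.70 and 0.71 for precision, 0.70 and 0.76 for recall, and 0.96 and 0.96 for accuracy. Per-class accuracy is computed one-vs-rest, so it also credits the many images correctly assigned to other classes; this is why it is far higher than the overall accuracy of 0.75 and than the per-class precision and recall.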
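As a quick check of the summary statistics, the following sketch recomputes the mean and median from the rounded per-class precision and recall values in the table above. The variable and function names are illustrative, not taken from the original analysis, and because the inputs are already rounded to three decimals the results can differ from the reported figures in the last digit.

```python
from statistics import mean, median

# Per-class precision and recall copied from the table above
# (order: cell periphery, cytoplasm, ..., vacuole).
precision = [0.842, 0.735, 0.539, 0.808, 0.695, 0.663,
             0.781, 0.893, 0.862, 0.442, 0.585, 0.537]
recall = [0.738, 0.846, 0.473, 0.749, 0.762, 0.778,
          0.828, 0.844, 0.833, 0.372, 0.562, 0.600]


def summarize(values):
    """Return the (mean, median) of a list of per-class scores."""
    return mean(values), median(values)


precision_mean, precision_median = summarize(precision)
recall_mean, recall_median = summarize(recall)

print(f"precision: mean={precision_mean:.3f}, median={precision_median:.3f}")
print(f"recall:    mean={recall_mean:.3f}, median={recall_median:.3f}")
```

These are unweighted (macro) averages: each class counts equally, regardless of how many cell images it contains.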

Confusion matrix

Each element \(M_{ij}\) of the confusion matrix \(M\) is the number of observations from true class \(i\) that were predicted as class \(j\). In the figure below, the matrix has been normalized by class support size: each row has been divided by the number of observations in that class, so the diagonal entries equal the per-class recall. The table that follows the figure lists the raw counts.

[Figure: confusion matrix, normalized by class support size]
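The row normalization described above can be sketched for a single class. The example uses the peroxisome row of the raw-count confusion matrix reported in this section; `normalize_row` is an illustrative helper, not a function from the original analysis.

```python
classes = ["cell periphery", "cytoplasm", "endosome", "er", "golgi",
           "mitochondrion", "nuclear periphery", "nucleolus", "nucleus",
           "peroxisome", "spindle pole", "vacuole"]

# Raw counts for true class "peroxisome" (one row of the confusion matrix),
# with columns in the same class order as the list above.
peroxisome_row = [0, 0, 1, 0, 0, 19, 1, 0, 0, 61, 82, 0]


def normalize_row(row):
    """Divide each count by the row total (the class support size)."""
    support = sum(row)
    if support == 0:
        # A class with no observations has no meaningful fractions.
        return [0.0 for _ in row]
    return [count / support for count in row]


normalized = dict(zip(classes, normalize_row(peroxisome_row)))

for name, fraction in normalized.items():
    if fraction > 0:
        print(f"peroxisome predicted as {name}: {fraction:.3f}")
```

After normalization each row sums to 1, which makes small classes such as peroxisome comparable with large ones such as er.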

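| True class (rows) / predicted class (columns) | cell periphery | cytoplasm | endosome | er | golgi | mitochondrion | nuclear periphery | nucleolus | nucleus | peroxisome | spindle pole | vacuole |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cell periphery | 1158 | 127 | 6 | 112 | 81 | 42 | 7 | 1 | 11 | 0 | 1 | 23 |
| cytoplasm | 32 | 1080 | 9 | 82 | 5 | 23 | 3 | 0 | 10 | 0 | 4 | 28 |
| endosome | 15 | 44 | 326 | 13 | 7 | 156 | 15 | 0 | 6 | 11 | 75 | 21 |
| er | 158 | 128 | 26 | 1315 | 21 | 23 | 14 | 0 | 2 | 0 | 0 | 68 |
| golgi | 3 | 3 | 0 | 29 | 291 | 18 | 1 | 6 | 0 | 0 | 0 | 31 |
| mitochondrion | 3 | 9 | 42 | 14 | 7 | 967 | 75 | 14 | 35 | 17 | 24 | 36 |
| nuclear periphery | 2 | 3 | 2 | 31 | 2 | 20 | 964 | 2 | 83 | 2 | 4 | 49 |
| nucleolus | 0 | 5 | 2 | 2 | 0 | 23 | 7 | 1066 | 43 | 2 | 101 | 12 |
| nucleus | 4 | 23 | 13 | 10 | 2 | 27 | 41 | 101 | 1356 | 1 | 17 | 32 |
| peroxisome | 0 | 0 | 1 | 0 | 0 | 19 | 1 | 0 | 0 | 61 | 82 | 0 |
| spindle pole | 0 | 42 | 168 | 3 | 0 | 72 | 1 | 0 | 9 | 44 | 439 | 3 |
| vacuole | 0 | 5 | 10 | 17 | 3 | 69 | 105 | 4 | 18 | 0 | 4 | 352 |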
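The per-class precision, recall, and accuracy reported earlier can be recovered from these counts. The sketch below assumes that per-class accuracy is computed one-vs-rest, which is consistent with the reported values; the function and variable names are illustrative.

```python
classes = ["cell periphery", "cytoplasm", "endosome", "er", "golgi",
           "mitochondrion", "nuclear periphery", "nucleolus", "nucleus",
           "peroxisome", "spindle pole", "vacuole"]

# Raw counts from the table above: rows are true classes,
# columns are predicted classes.
M = [
    [1158, 127, 6, 112, 81, 42, 7, 1, 11, 0, 1, 23],
    [32, 1080, 9, 82, 5, 23, 3, 0, 10, 0, 4, 28],
    [15, 44, 326, 13, 7, 156, 15, 0, 6, 11, 75, 21],
    [158, 128, 26, 1315, 21, 23, 14, 0, 2, 0, 0, 68],
    [3, 3, 0, 29, 291, 18, 1, 6, 0, 0, 0, 31],
    [3, 9, 42, 14, 7, 967, 75, 14, 35, 17, 24, 36],
    [2, 3, 2, 31, 2, 20, 964, 2, 83, 2, 4, 49],
    [0, 5, 2, 2, 0, 23, 7, 1066, 43, 2, 101, 12],
    [4, 23, 13, 10, 2, 27, 41, 101, 1356, 1, 17, 32],
    [0, 0, 1, 0, 0, 19, 1, 0, 0, 61, 82, 0],
    [0, 42, 168, 3, 0, 72, 1, 0, 9, 44, 439, 3],
    [0, 5, 10, 17, 3, 69, 105, 4, 18, 0, 4, 352],
]


def per_class_metrics(matrix):
    """Return one dict of precision, recall, and one-vs-rest accuracy per class."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        support = sum(matrix[k])                          # images truly in class k
        predicted = sum(matrix[i][k] for i in range(n))   # images called class k
        fn = support - tp
        fp = predicted - tp
        results.append({
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / support if support else 0.0,
            "accuracy": (total - fp - fn) / total,
        })
    return results


metrics = dict(zip(classes, per_class_metrics(M)))
total_images = sum(sum(row) for row in M)
overall_accuracy = sum(M[k][k] for k in range(len(M))) / total_images

for name, m in metrics.items():
    print(f"{name}: precision={m['precision']:.3f}, "
          f"recall={m['recall']:.3f}, accuracy={m['accuracy']:.3f}")
print(f"overall accuracy: {overall_accuracy:.3f}")
```

The overall accuracy is the sum of the diagonal divided by the total number of test images, which matches the test-set figure given at the top of this report.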

To see misclassification patterns more clearly, the diagonal of the confusion matrix has been excluded in the following figure.

[Figure: confusion matrix with the diagonal excluded]

Protein classification accuracy

Across proteins, the mean and median accuracy were 0.954 and 1.000, respectively.

| Class | Number of proteins | Correct calls | Accuracy |
|---|---|---|---|
| cell periphery | 7 | 6 | 0.857 |
| cytoplasm | 108 | 105 | 0.972 |
| endosome | 2 | 1 | 0.500 |
| er | 29 | 27 | 0.931 |
| golgi | 2 | 2 | 1.000 |
| mitochondrion | 45 | 45 | 1.000 |
| nuclear periphery | 5 | 5 | 1.000 |
| nucleolus | 7 | 6 | 0.857 |
| nucleus | 69 | 66 | 0.957 |
| peroxisome | 1 | 0 | 0.000 |
| spindle pole | 2 | 2 | 1.000 |
| vacuole | 5 | 4 | 0.800 |
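The protein-level figures can be cross-checked against the table. The sketch below assumes that each protein receives a single call that is either correct or incorrect, so that the mean accuracy across proteins equals total correct calls divided by total proteins; this is consistent with the reported mean of 0.954 and median of 1.000, but how the per-protein call is derived from cell images is not stated here. Names are illustrative.

```python
# (class, number of proteins, correct calls), copied from the table above.
protein_table = [
    ("cell periphery", 7, 6),
    ("cytoplasm", 108, 105),
    ("endosome", 2, 1),
    ("er", 29, 27),
    ("golgi", 2, 2),
    ("mitochondrion", 45, 45),
    ("nuclear periphery", 5, 5),
    ("nucleolus", 7, 6),
    ("nucleus", 69, 66),
    ("peroxisome", 1, 0),
    ("spindle pole", 2, 2),
    ("vacuole", 5, 4),
]

total_proteins = sum(n for _, n, _ in protein_table)
total_correct = sum(c for _, _, c in protein_table)
overall_protein_accuracy = total_correct / total_proteins

# Per-class accuracy is correct calls divided by proteins in that class.
class_accuracy = {name: c / n for name, n, c in protein_table}

print(f"{total_correct} of {total_proteins} proteins correct "
      f"({overall_protein_accuracy:.3f})")
for name, acc in class_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

Classes with only one or two proteins (endosome, golgi, peroxisome, spindle pole) can only take a few distinct accuracy values, so their per-class figures should be read with caution.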

Protein classification accuracy (at least 10 cell images)

Across proteins with at least 10 cell images, the mean and median accuracy were 0.964 and 1.000, respectively.

| Class | Number of proteins | Correct calls | Accuracy |
|---|---|---|---|
| cell periphery | 7 | 6 | 0.857 |
| cytoplasm | 66 | 64 | 0.970 |
| endosome | 2 | 1 | 0.500 |
| er | 27 | 26 | 0.963 |
| golgi | 2 | 2 | 1.000 |
| mitochondrion | 42 | 42 | 1.000 |
| nuclear periphery | 5 | 5 | 1.000 |
| nucleolus | 7 | 6 | 0.857 |
| nucleus | 57 | 56 | 0.982 |
| peroxisome | 1 | 0 | 0.000 |
| spindle pole | 2 | 2 | 1.000 |
| vacuole | 4 | 4 | 1.000 |

Misclassifications - false negatives

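For each class, the panels show the original and the scaled cell images, respectively. The false negative rate is the fraction of a class's images that were assigned to any other class, i.e. 1 minus the recall.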

| Class | False negative rate |
|---|---|
| cell periphery | 0.262 |
| cytoplasm | 0.154 |
| endosome | 0.527 |
| er | 0.251 |
| golgi | 0.238 |
| mitochondrion | 0.222 |
| nuclear periphery | 0.172 |
| nucleolus | 0.156 |
| nucleus | 0.167 |
| peroxisome | 0.628 |
| spindle pole | 0.438 |
| vacuole | 0.400 |

[Figures: two image panels per class, in the order listed above]
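The false negative rates follow directly from the per-class image counts and correct calls reported in the "Accuracy per category" table. A minimal sketch, with illustrative names:

```python
# class -> (number of cell images, correct calls), from the per-class table.
per_class_counts = {
    "cell periphery": (1569, 1158),
    "cytoplasm": (1276, 1080),
    "endosome": (689, 326),
    "er": (1755, 1315),
    "golgi": (382, 291),
    "mitochondrion": (1243, 967),
    "nuclear periphery": (1164, 964),
    "nucleolus": (1263, 1066),
    "nucleus": (1627, 1356),
    "peroxisome": (164, 61),
    "spindle pole": (781, 439),
    "vacuole": (587, 352),
}


def false_negative_rate(support, correct):
    """Fraction of a class's images that were assigned to another class."""
    return (support - correct) / support


fnr = {name: false_negative_rate(n, c)
       for name, (n, c) in per_class_counts.items()}

# List classes from the highest to the lowest false negative rate.
for name, rate in sorted(fnr.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.3f}")
```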

Misclassifications - false positives

[Figures: two image panels of false positives for each class, in the order cell periphery, cytoplasm, endosome, er, golgi, mitochondrion, nuclear periphery, nucleolus, nucleus, peroxisome, spindle pole, vacuole]

Common mistakes (top 5)

| True class | Total images | Predicted as | Count | Error rate |
|---|---|---|---|---|
| peroxisome | 164 | spindle pole | 82 | 50.0% |
| endosome | 689 | mitochondrion | 156 | 22.6% |
| spindle pole | 781 | endosome | 168 | 21.5% |
| vacuole | 587 | nuclear periphery | 105 | 17.9% |
| vacuole | 587 | mitochondrion | 69 | 11.8% |

[Figures: two image panels per mistake, in the order listed above]
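The list of common mistakes can be reproduced from the raw confusion-matrix counts. The sketch below assumes the mistakes are ranked by error rate (off-diagonal count divided by the true class's total images), which is consistent with the order shown above; ranking by raw count would give a different order. The function name is illustrative.

```python
classes = ["cell periphery", "cytoplasm", "endosome", "er", "golgi",
           "mitochondrion", "nuclear periphery", "nucleolus", "nucleus",
           "peroxisome", "spindle pole", "vacuole"]

# Raw confusion-matrix counts: rows are true classes, columns are predictions.
M = [
    [1158, 127, 6, 112, 81, 42, 7, 1, 11, 0, 1, 23],
    [32, 1080, 9, 82, 5, 23, 3, 0, 10, 0, 4, 28],
    [15, 44, 326, 13, 7, 156, 15, 0, 6, 11, 75, 21],
    [158, 128, 26, 1315, 21, 23, 14, 0, 2, 0, 0, 68],
    [3, 3, 0, 29, 291, 18, 1, 6, 0, 0, 0, 31],
    [3, 9, 42, 14, 7, 967, 75, 14, 35, 17, 24, 36],
    [2, 3, 2, 31, 2, 20, 964, 2, 83, 2, 4, 49],
    [0, 5, 2, 2, 0, 23, 7, 1066, 43, 2, 101, 12],
    [4, 23, 13, 10, 2, 27, 41, 101, 1356, 1, 17, 32],
    [0, 0, 1, 0, 0, 19, 1, 0, 0, 61, 82, 0],
    [0, 42, 168, 3, 0, 72, 1, 0, 9, 44, 439, 3],
    [0, 5, 10, 17, 3, 69, 105, 4, 18, 0, 4, 352],
]


def top_confusions(matrix, names, k=5):
    """Return the k off-diagonal cells with the highest row-normalized rate."""
    mistakes = []
    for i, row in enumerate(matrix):
        support = sum(row)
        for j, count in enumerate(row):
            if i != j and count > 0:
                mistakes.append((count / support, names[i], names[j],
                                 count, support))
    mistakes.sort(key=lambda m: m[0], reverse=True)
    return mistakes[:k]


top5 = top_confusions(M, classes)

for rate, true_name, pred_name, count, support in top5:
    print(f"True class: {true_name} (total images {support}), "
          f"predicted as: {pred_name}, count: {count} ({rate:.1%} error)")
```

Ranking by rate rather than by count brings out confusions in small classes: the peroxisome to spindle pole confusion involves only 82 images but accounts for half of that class.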