# Credal Classification

First, we introduce a small tolerance for comparing floating-point values, to account for rounding errors in the code. You can set the global value here:

In [1]:
TOL = 1e-6
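
To see why such a tolerance is needed, the following minimal sketch compares two floating-point values that are mathematically equal. The variable names `a`, `b`, and the three flags are chosen here for illustration only; the comparison pattern `x + TOL >= y` is the one used later in this notebook.

```python
TOL = 1e-6  # same tolerance as above

a = 0.1 + 0.2  # stored as 0.30000000000000004
b = 0.3

exact_equal = a == b         # False: the two floats differ in the last bit
naive_geq = b >= a           # False, although mathematically 0.3 >= 0.3
tolerant_geq = b + TOL >= a  # True: the tolerance absorbs the rounding error
print(exact_equal, naive_geq, tolerant_geq)
```

Without the tolerance, a class whose probability ties with the maximum could be wrongly rejected because of rounding alone.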

## Data Set

To demonstrate credal classification, we will use the [mammographic mass dataset](http://archive.ics.uci.edu/ml/datasets/mammographic+mass), a breast cancer dataset from the UCI Machine Learning Repository. Note: if you can, hide the next cell for easier navigation.

In [2]:
csv_file = """5,67,3,5,3,1
4,43,1,1,?,1
5,58,4,5,3,1
4,28,1,1,3,0
5,74,1,5,?,1
4,65,1,?,3,0
4,70,?,?,3,0
5,42,1,?,3,0
5,57,1,5,3,1
5,60,?,5,1,1
5,76,1,4,3,1
3,42,2,1,3,1
4,64,1,?,3,0
4,36,3,1,2,0
4,60,2,1,2,0
4,54,1,1,3,0
3,52,3,4,3,0
4,59,2,1,3,1
4,54,1,1,3,1
4,40,1,?,?,0
?,66,?,?,1,1
5,56,4,3,1,1
4,43,1,?,?,0
5,42,4,4,3,1
4,59,2,4,3,1
5,75,4,5,3,1
2,66,1,1,?,0
5,63,3,?,3,0
5,45,4,5,3,1
5,55,4,4,3,0
4,46,1,5,2,0
5,54,4,4,3,1
5,57,4,4,3,1
4,39,1,1,2,0
4,81,1,1,3,0
4,77,3,?,?,0
4,60,2,1,3,0
5,67,3,4,2,1
4,48,4,5,?,1
4,55,3,4,2,0
4,59,2,1,?,0
4,78,1,1,1,0
4,50,1,1,3,0
4,61,2,1,?,0
5,62,3,5,2,1
5,44,2,4,?,1
5,64,4,5,3,1
4,23,1,1,?,0
2,42,?,?,4,0
5,67,4,5,3,1
4,74,2,1,2,0
5,80,3,5,3,1
4,23,1,1,?,0
4,63,2,1,?,0
4,53,?,5,3,1
4,43,3,4,?,0
4,49,2,1,1,0
5,51,2,4,?,0
4,45,2,1,?,0
5,59,2,?,?,1
5,52,4,3,3,1
5,60,4,3,3,1
4,57,2,5,3,0
3,57,2,1,?,0
5,74,4,4,3,1
4,25,2,1,?,0
4,49,1,1,3,0
5,72,4,3,?,1
4,45,2,1,3,0
4,64,2,1,3,0
4,73,2,1,2,0
5,68,4,3,3,1
5,52,4,5,3,0
5,66,4,4,3,1
5,70,?,4,?,1
4,25,1,1,3,0
5,74,1,1,2,1
4,64,1,1,3,0
5,60,4,3,2,1
5,67,2,4,1,0
4,67,4,5,3,0
5,44,4,4,2,1
3,68,1,1,3,1
4,57,?,4,1,0
5,51,4,?,?,1
4,33,1,?,?,0
5,58,4,4,3,1
5,36,1,?,?,0
4,63,1,1,?,0
5,62,1,5,3,1
4,73,3,4,3,1
4,80,4,4,3,1
4,67,1,1,?,0
5,59,2,1,3,1
5,60,1,?,3,0
5,54,4,4,3,1
4,40,1,1,?,0
4,47,2,1,?,0
5,62,4,4,3,0
4,33,2,1,3,0
5,59,2,?,?,0
4,65,2,?,?,0
4,58,4,4,?,0
4,29,2,?,?,0
4,58,1,1,?,0
4,54,1,1,?,0
4,44,1,1,?,1
3,34,2,1,?,0
4,57,1,1,3,0
5,33,4,4,?,1
4,45,4,4,3,0
5,71,4,4,3,1
5,59,4,4,2,0
4,56,2,1,?,0
4,40,3,4,?,0
4,56,1,1,3,0
4,45,2,1,?,0
4,57,2,1,2,0
5,55,3,4,3,1
5,84,4,5,3,0
5,51,4,4,3,1
4,43,1,1,?,0
4,24,2,1,2,0
4,66,1,1,3,0
5,33,4,4,3,0
4,59,4,3,2,0
4,76,2,3,?,0
4,40,1,1,?,0
4,52,?,4,?,0
5,40,4,5,3,1
5,67,4,4,3,1
5,75,4,3,3,1
5,86,4,4,3,0
4,60,2,?,?,0
5,66,4,4,3,1
5,46,4,5,3,1
4,59,4,4,3,1
5,65,4,4,3,1
4,53,1,1,3,0
5,67,3,5,3,1
5,80,4,5,3,1
4,55,2,1,3,0
4,48,1,1,?,0
4,47,1,1,2,0
4,50,2,1,?,0
5,62,4,5,3,1
5,63,4,4,3,1
4,63,4,?,3,1
4,71,4,4,3,1
4,41,1,1,3,0
5,57,4,4,4,1
5,71,4,4,4,1
4,66,1,1,3,0
4,47,2,4,2,0
3,34,4,4,3,0
4,59,3,4,3,0
5,55,2,?,?,1
4,51,?,?,3,0
4,62,2,1,?,0
4,58,4,?,3,1
5,67,4,4,3,1
4,41,2,1,3,0
4,23,3,1,3,0
4,53,?,4,3,0
4,42,2,1,3,0
5,87,4,5,3,1
4,68,1,1,3,1
4,64,1,1,3,0
5,54,3,5,3,1
5,86,4,5,3,1
4,21,2,1,3,0
4,39,1,1,?,0
4,53,4,4,3,0
4,44,4,4,3,0
4,54,1,1,3,0
5,63,4,5,3,1
4,62,2,1,?,0
4,45,2,1,2,0
5,71,4,5,3,0
5,49,4,4,3,1
4,49,4,4,3,0
5,66,4,4,4,0
4,19,1,1,3,0
4,35,1,1,2,0
4,71,3,3,?,1
5,74,4,5,3,1
5,37,4,4,3,1
4,67,1,?,3,0
5,81,3,4,3,1
5,59,4,4,3,1
4,34,1,1,3,0
5,79,4,3,3,1
5,60,3,1,3,0
4,41,1,1,3,1
4,50,1,1,3,0
5,85,4,4,3,1
4,46,1,1,3,0
5,66,4,4,3,1
4,73,3,1,2,0
4,55,1,1,3,0
4,49,2,1,3,0
3,49,4,4,3,0
4,51,4,5,3,1
2,48,4,4,3,0
4,58,4,5,3,0
5,72,4,5,3,1
4,46,2,3,3,0
4,43,4,3,3,1
?,52,4,4,3,0
4,66,2,1,?,0
4,46,1,1,1,0
4,69,3,1,3,0
2,59,1,1,?,1
5,43,2,1,3,1
5,76,4,5,3,1
4,46,1,1,3,0
4,59,2,4,3,0
4,57,1,1,3,0
5,43,4,5,?,0
3,45,2,1,3,0
3,43,2,1,3,0
4,45,2,1,3,0
5,57,4,5,3,1
5,79,4,4,3,1
5,54,2,1,3,1
4,40,3,4,3,0
5,63,4,4,3,1
2,55,1,?,1,0
4,52,2,1,3,0
4,38,1,1,3,0
3,72,4,3,3,0
5,80,4,3,3,1
5,76,4,3,3,1
4,62,3,1,3,0
5,64,4,5,3,1
5,42,4,5,3,0
3,60,?,3,1,0
4,64,4,5,3,0
4,63,4,4,3,1
4,24,2,1,2,0
5,72,4,4,3,1
4,63,2,1,3,0
4,46,1,1,3,0
3,33,1,1,3,0
5,76,4,4,3,1
4,36,2,3,3,0
4,40,2,1,3,0
5,58,1,5,3,1
4,43,2,1,3,0
3,42,1,1,3,0
4,32,1,1,3,0
5,57,4,4,2,1
4,37,1,1,3,0
4,70,4,4,3,1
5,56,4,2,3,1
3,76,?,3,2,0
5,73,4,4,3,1
5,77,4,5,3,1
5,67,4,4,1,1
5,71,4,3,3,1
5,65,4,4,3,1
4,43,1,1,3,0
4,40,2,1,?,0
4,49,2,1,3,0
5,76,4,2,3,1
4,55,4,4,3,0
5,72,4,5,3,1
3,53,4,3,3,0
5,75,4,4,3,1
5,61,4,5,3,1
5,67,4,4,3,1
5,55,4,2,3,1
5,66,4,4,3,1
2,76,1,1,2,0
4,57,4,4,3,1
5,71,3,1,3,0
5,70,4,5,3,1
4,35,4,2,?,0
5,79,1,?,3,1
4,63,2,1,3,0
5,40,1,4,3,1
4,41,1,1,3,0
4,47,2,1,2,0
4,68,1,1,3,1
4,64,4,3,3,1
4,65,4,4,?,1
4,73,4,3,3,0
4,39,4,3,3,0
5,55,4,5,4,1
5,53,3,4,4,0
5,66,4,4,3,1
4,43,3,1,2,0
5,44,4,5,3,1
4,77,4,4,3,1
4,62,2,4,3,0
5,80,4,4,3,1
4,33,4,4,3,0
4,50,4,5,3,1
4,71,1,?,3,0
5,46,4,4,3,1
5,49,4,5,3,1
4,53,1,1,3,0
3,46,2,1,2,0
4,57,1,1,3,0
4,54,3,1,3,0
4,54,1,?,?,0
2,49,2,1,2,0
4,47,3,1,3,0
4,40,1,1,3,0
4,45,1,1,3,0
4,50,4,5,3,1
5,54,4,4,3,1
4,67,4,1,3,1
4,77,4,4,3,1
4,66,4,3,3,0
4,71,2,?,3,1
4,36,2,3,3,0
4,69,4,4,3,0
4,48,1,1,3,0
4,64,4,4,3,1
4,71,4,2,3,1
5,60,4,3,3,1
4,24,1,1,3,0
5,34,4,5,2,1
4,79,1,1,2,0
4,45,1,1,3,0
4,37,2,1,2,0
4,42,1,1,2,0
4,72,4,4,3,1
5,60,4,5,3,1
5,85,3,5,3,1
4,51,1,1,3,0
5,54,4,5,3,1
5,55,4,3,3,1
4,64,4,4,3,0
5,67,4,5,3,1
5,75,4,3,3,1
5,87,4,4,3,1
4,46,4,4,3,1
4,59,2,1,?,0
55,46,4,3,3,1
5,61,1,1,3,1
4,44,1,4,3,0
4,32,1,1,3,0
4,62,1,1,3,0
5,59,4,5,3,1
4,61,4,1,3,0
5,78,4,4,3,1
5,42,4,5,3,0
4,45,1,2,3,0
5,34,2,1,3,1
5,39,4,3,?,1
4,27,3,1,3,0
4,43,1,1,3,0
5,83,4,4,3,1
4,36,2,1,3,0
4,37,2,1,3,0
4,56,3,1,3,1
5,55,4,4,3,1
5,46,3,?,3,0
4,88,4,4,3,1
5,71,4,4,3,1
4,41,2,1,3,0
5,49,4,4,3,1
3,51,1,1,4,0
4,39,1,3,3,0
4,46,2,1,3,0
5,52,4,4,3,1
5,58,4,4,3,1
4,67,4,5,3,1
5,80,4,4,3,1
3,46,1,?,?,0
3,43,1,?,?,0
4,45,1,1,3,0
5,68,4,4,3,1
4,54,4,4,?,1
4,44,2,3,3,0
5,74,4,3,3,1
5,55,4,5,3,0
4,49,4,4,3,1
4,49,1,1,3,0
5,50,4,3,3,1
5,52,3,5,3,1
4,45,1,1,3,0
4,66,1,1,3,0
4,68,4,4,3,1
4,72,2,1,3,0
5,64,?,?,3,0
2,49,?,3,3,0
3,44,?,4,3,0
5,74,4,4,3,1
5,58,4,4,3,1
4,77,2,3,3,0
4,49,3,1,3,0
4,34,?,?,4,0
5,60,4,3,3,1
5,69,4,3,3,1
4,53,2,1,3,0
3,46,3,4,3,0
5,74,4,4,3,1
4,58,1,1,3,0
5,68,4,4,3,1
5,46,4,3,3,0
5,61,2,4,3,1
5,70,4,3,3,1
5,37,4,4,3,1
3,65,4,5,3,1
4,67,4,4,3,0
5,69,3,4,3,0
5,76,4,4,3,1
4,65,4,3,3,0
5,72,4,2,3,1
4,62,4,2,3,0
5,42,4,4,3,1
5,66,4,3,3,1
5,48,4,4,3,1
4,35,1,1,3,0
5,60,4,4,3,1
5,67,4,2,3,1
5,78,4,4,3,1
4,66,1,1,3,1
4,26,1,1,?,0
4,48,1,1,3,0
4,31,1,1,3,0
5,43,4,3,3,1
5,72,2,4,3,0
5,66,1,1,3,1
4,56,4,4,3,0
5,58,4,5,3,1
5,33,2,4,3,1
4,37,1,1,3,0
5,36,4,3,3,1
4,39,2,3,3,0
4,39,4,4,3,1
5,83,4,4,3,1
4,68,4,5,3,1
5,63,3,4,3,1
5,78,4,4,3,1
4,38,2,3,3,0
5,46,4,3,3,1
5,60,4,4,3,1
5,56,2,3,3,1
4,33,1,1,3,0
4,?,4,5,3,1
4,69,1,5,3,1
5,66,1,4,3,1
4,72,1,3,3,0
4,29,1,1,3,0
5,54,4,5,3,1
5,80,4,4,3,1
5,68,4,3,3,1
4,35,2,1,3,0
4,57,3,?,3,0
5,?,4,4,3,1
4,50,1,1,3,0
4,32,4,3,3,0
0,69,4,5,3,1
4,71,4,5,3,1
5,87,4,5,3,1
3,40,2,?,3,0
4,31,1,1,?,0
4,64,1,1,3,0
5,55,4,5,3,1
4,18,1,1,3,0
3,50,2,1,?,0
4,53,1,1,3,0
5,84,4,5,3,1
5,80,4,3,3,1
4,32,1,1,3,0
5,77,3,4,3,1
4,38,1,1,3,0
5,54,4,5,3,1
4,63,1,1,3,0
4,61,1,1,3,0
4,52,1,1,3,0
4,36,1,1,3,0
4,41,?,?,3,0
4,59,1,1,3,0
5,51,4,4,2,1
4,36,1,1,3,0
5,40,4,3,3,1
4,49,1,1,3,0
4,37,2,3,3,0
4,46,1,1,3,0
4,63,1,1,3,0
4,28,2,1,3,0
4,47,2,1,3,0
4,42,2,1,3,1
5,44,4,5,3,1
4,49,4,4,3,0
5,47,4,5,3,1
5,52,4,5,3,1
4,53,1,1,3,1
5,83,3,3,3,1
4,50,4,4,?,1
5,63,4,4,3,1
4,82,?,5,3,1
4,54,1,1,3,0
4,50,4,4,3,0
5,80,4,5,3,1
5,45,2,4,3,0
5,59,4,4,?,1
4,28,2,1,3,0
4,31,1,1,3,0
4,41,2,1,3,0
4,21,3,1,3,0
5,44,3,4,3,1
5,49,4,4,3,1
5,71,4,5,3,1
5,75,4,5,3,1
4,38,2,1,3,0
4,60,1,3,3,0
5,87,4,5,3,1
4,70,4,4,3,1
5,55,4,5,3,1
3,21,1,1,3,0
4,50,1,1,3,0
5,76,4,5,3,1
4,23,1,1,3,0
3,68,?,?,3,0
4,62,4,?,3,1
5,65,1,?,3,1
5,73,4,5,3,1
4,38,2,3,3,0
2,57,1,1,3,0
5,65,4,5,3,1
5,67,2,4,3,1
5,61,2,4,3,1
5,56,4,4,3,0
5,71,2,4,3,1
4,49,2,2,3,0
4,55,?,?,3,0
4,44,2,1,3,0
0,58,4,4,3,0
4,27,2,1,3,0
5,73,4,5,3,1
4,34,2,1,3,0
5,63,?,4,3,1
4,50,2,1,3,1
4,62,2,1,3,0
3,21,3,1,3,0
4,49,2,?,3,0
4,36,3,1,3,0
4,45,2,1,3,1
5,67,4,5,3,1
4,21,1,1,3,0
4,57,2,1,3,0
5,66,4,5,3,1
4,71,4,4,3,1
5,69,3,4,3,1
6,80,4,5,3,1
3,27,2,1,3,0
4,38,2,1,3,0
4,23,2,1,3,0
5,70,?,5,3,1
4,46,4,3,3,0
4,61,2,3,3,0
5,65,4,5,3,1
4,60,4,3,3,0
5,83,4,5,3,1
5,40,4,4,3,1
2,59,?,4,3,0
4,53,3,4,3,0
4,76,4,4,3,0
5,79,1,4,3,1
5,38,2,4,3,1
4,61,3,4,3,0
4,56,2,1,3,0
4,44,2,1,3,0
4,64,3,4,?,1
4,66,3,3,3,0
4,50,3,3,3,0
4,46,1,1,3,0
4,39,1,1,3,0
4,60,3,?,?,0
5,55,4,5,3,1
4,40,2,1,3,0
4,26,1,1,3,0
5,84,3,2,3,1
4,41,2,2,3,0
4,63,1,1,3,0
2,65,?,1,2,0
4,49,1,1,3,0
4,56,2,2,3,1
5,65,4,4,3,0
4,54,1,1,3,0
4,36,1,1,3,0
5,49,4,4,3,0
4,59,4,4,3,1
5,75,4,4,3,1
5,59,4,2,3,0
5,59,4,4,3,1
4,28,4,4,3,1
5,53,4,5,3,0
5,57,4,4,3,0
5,77,4,3,4,0
5,85,4,3,3,1
4,59,4,4,3,0
5,59,1,5,3,1
4,65,3,3,3,1
4,54,2,1,3,0
5,46,4,5,3,1
4,63,4,4,3,1
4,53,1,1,3,1
4,56,1,1,3,0
5,66,4,4,3,1
5,66,4,5,3,1
4,55,1,1,3,0
4,44,1,1,3,0
5,86,3,4,3,1
5,47,4,5,3,1
5,59,4,5,3,1
5,66,4,5,3,0
5,61,4,3,3,1
3,46,?,5,?,1
4,69,1,1,3,0
5,93,1,5,3,1
4,39,1,3,3,0
5,44,4,5,3,1
4,45,2,2,3,0
4,51,3,4,3,0
4,56,2,4,3,0
4,66,4,4,3,0
5,61,4,5,3,1
4,64,3,3,3,1
5,57,2,4,3,0
5,79,4,4,3,1
4,57,2,1,?,0
4,44,4,1,1,0
4,31,2,1,3,0
4,63,4,4,3,0
4,64,1,1,3,0
5,47,4,5,3,0
5,68,4,5,3,1
4,30,1,1,3,0
5,43,4,5,3,1
4,56,1,1,3,0
4,46,2,1,3,0
4,67,2,1,3,0
5,52,4,5,3,1
4,67,4,4,3,1
4,47,2,1,3,0
5,58,4,5,3,1
4,28,2,1,3,0
4,43,1,1,3,0
4,57,2,4,3,0
5,68,4,5,3,1
4,64,2,4,3,0
4,64,2,4,3,0
5,62,4,4,3,1
4,38,4,1,3,0
5,68,4,4,3,1
4,41,2,1,3,0
4,35,2,1,3,1
4,68,2,1,3,0
5,55,4,4,3,1
5,67,4,4,3,1
4,51,4,3,3,0
2,40,1,1,3,0
5,73,4,4,3,1
4,58,?,4,3,1
4,51,?,4,3,0
3,50,?,?,3,1
5,59,4,3,3,1
6,60,3,5,3,1
4,27,2,1,?,0
5,54,4,3,3,0
4,56,1,1,3,0
5,53,4,5,3,1
4,54,2,4,3,0
5,79,1,4,3,1
5,67,4,3,3,1
5,64,3,3,3,1
4,70,1,2,3,1
5,55,4,3,3,1
5,65,3,3,3,1
5,45,4,2,3,1
4,57,4,4,?,1
5,49,1,1,3,1
4,24,2,1,3,0
4,52,1,1,3,0
4,50,2,1,3,0
4,35,1,1,3,0
5,?,3,3,3,1
5,64,4,3,3,1
5,40,4,1,1,1
5,66,4,4,3,1
4,64,4,4,3,1
5,52,4,3,3,1
5,43,1,4,3,1
4,56,4,4,3,0
4,72,3,?,3,0
6,51,4,4,3,1
4,79,4,4,3,1
4,22,2,1,3,0
4,73,2,1,3,0
4,53,3,4,3,0
4,59,2,1,3,1
4,46,4,4,2,0
5,66,4,4,3,1
4,50,4,3,3,1
4,58,1,1,3,1
4,55,1,1,3,0
4,62,2,4,3,1
4,60,1,1,3,0
5,57,4,3,3,1
4,57,1,1,3,0
6,41,2,1,3,0
4,71,2,1,3,1
4,32,2,1,3,0
4,57,2,1,3,0
4,19,1,1,3,0
4,62,2,4,3,1
5,67,4,5,3,1
4,50,4,5,3,0
4,65,2,3,2,0
4,40,2,4,2,0
6,71,4,4,3,1
6,68,4,3,3,1
4,68,1,1,3,0
4,29,1,1,3,0
4,53,2,1,3,0
5,66,4,4,3,1
4,60,3,?,4,0
5,76,4,4,3,1
4,58,2,1,2,0
5,96,3,4,3,1
5,70,4,4,3,1
4,34,2,1,3,0
4,59,2,1,3,0
4,45,3,1,3,1
5,65,4,4,3,1
4,59,1,1,3,0
4,21,2,1,3,0
3,43,2,1,3,0
4,53,1,1,3,0
4,65,2,1,3,0
4,64,2,4,3,1
4,53,4,4,3,0
4,51,1,1,3,0
4,59,2,4,3,0
4,56,2,1,3,0
4,60,2,1,3,0
4,22,1,1,3,0
4,25,2,1,3,0
6,76,3,?,3,0
5,69,4,4,3,1
4,58,2,1,3,0
5,62,4,3,3,1
4,56,4,4,3,0
4,64,1,1,3,0
4,32,2,1,3,0
5,48,?,4,?,1
5,59,4,4,2,1
4,52,1,1,3,0
4,63,4,4,3,0
5,67,4,4,3,1
5,61,4,4,3,1
5,59,4,5,3,1
5,52,4,3,3,1
4,35,4,4,3,0
5,77,3,3,3,1
5,71,4,3,3,1
5,63,4,3,3,1
4,38,2,1,2,0
5,72,4,3,3,1
4,76,4,3,3,1
4,53,3,3,3,0
4,67,4,5,3,0
5,69,2,4,3,1
4,54,1,1,3,0
2,35,2,1,2,0
5,68,4,3,3,1
4,68,4,4,3,0
4,67,2,4,3,1
3,39,1,1,3,0
4,44,2,1,3,0
4,33,1,1,3,0
4,60,?,4,3,0
4,58,1,1,3,0
4,31,1,1,3,0
3,23,1,1,3,0
5,56,4,5,3,1
4,69,2,1,3,1
6,63,1,1,3,0
4,65,1,1,3,1
4,44,2,1,2,0
4,62,3,3,3,1
4,67,4,4,3,1
4,56,2,1,3,0
4,52,3,4,3,0
4,43,1,1,3,1
4,41,4,3,2,1
4,42,3,4,2,0
3,46,1,1,3,0
5,55,4,4,3,1
5,58,4,4,2,1
5,87,4,4,3,1
4,66,2,1,3,0
0,72,4,3,3,1
5,60,4,3,3,1
5,83,4,4,2,1
4,31,2,1,3,0
4,53,2,1,3,0
4,64,2,3,3,0
5,31,4,4,2,1
5,62,4,4,2,1
4,56,2,1,3,0
5,58,4,4,3,1
4,67,1,4,3,0
5,75,4,5,3,1
5,65,3,4,3,1
5,74,3,2,3,1
4,59,2,1,3,0
4,57,4,4,4,1
4,76,3,2,3,0
4,63,1,4,3,0
4,44,1,1,3,0
4,42,3,1,2,0
4,35,3,?,2,0
5,65,4,3,3,1
4,70,2,1,3,0
4,48,1,1,3,0
4,74,1,1,1,1
6,40,?,3,4,1
4,63,1,1,3,0
5,60,4,4,3,1
5,86,4,3,3,1
4,27,1,1,3,0
4,71,4,5,2,1
5,85,4,4,3,1
4,51,3,3,3,0
6,72,4,3,3,1
5,52,4,4,3,1
4,66,2,1,3,0
5,71,4,5,3,1
4,42,2,1,3,0
4,64,4,4,2,1
4,41,2,2,3,0
4,50,2,1,3,0
4,30,1,1,3,0
4,67,1,1,3,0
5,62,4,4,3,1
4,46,2,1,2,0
4,35,1,1,3,0
4,53,1,1,2,0
4,59,2,1,3,0
4,19,3,1,3,0
5,86,2,1,3,1
4,72,2,1,3,0
4,37,2,1,2,0
4,46,3,1,3,1
4,45,1,1,3,0
4,48,4,5,3,0
4,58,4,4,3,1
4,42,1,1,3,0
4,56,2,4,3,1
4,47,2,1,3,0
4,49,4,4,3,1
5,76,2,5,3,1
5,62,4,5,3,1
5,64,4,4,3,1
5,53,4,3,3,1
4,70,4,2,2,1
5,55,4,4,3,1
4,34,4,4,3,0
5,76,4,4,3,1
4,39,1,1,3,0
2,23,1,1,3,0
4,19,1,1,3,0
5,65,4,5,3,1
4,57,2,1,3,0
5,41,4,4,3,1
4,36,4,5,3,1
4,62,3,3,3,0
4,69,2,1,3,0
4,41,3,1,3,0
3,51,2,4,3,0
5,50,3,2,3,1
4,47,4,4,3,0
4,54,4,5,3,1
5,52,4,4,3,1
4,30,1,1,3,0
3,48,4,4,3,1
5,?,4,4,3,1
4,65,2,4,3,1
4,50,1,1,3,0
5,65,4,5,3,1
5,66,4,3,3,1
6,41,3,3,2,1
5,72,3,2,3,1
4,42,1,1,1,1
4,80,4,4,3,1
0,45,2,4,3,0
4,41,1,1,3,0
4,72,3,3,3,1
4,60,4,5,3,0
5,67,4,3,3,1
4,55,2,1,3,0
4,61,3,4,3,1
4,55,3,4,3,1
4,52,4,4,3,1
4,42,1,1,3,0
5,63,4,4,3,1
4,62,4,5,3,1
4,46,1,1,3,0
4,65,2,1,3,0
4,57,3,3,3,1
4,66,4,5,3,1
4,45,1,1,3,0
4,77,4,5,3,1
4,35,1,1,3,0
4,50,4,5,3,1
4,57,4,4,3,0
4,74,3,1,3,1
4,59,4,5,3,0
4,51,1,1,3,0
4,42,3,4,3,1
4,35,2,4,3,0
4,42,1,1,3,0
4,43,2,1,3,0
4,62,4,4,3,1
4,27,2,1,3,0
5,?,4,3,3,1
4,57,4,4,3,1
4,59,2,1,3,0
5,40,3,2,3,1
4,20,1,1,3,0
5,74,4,3,3,1
4,22,1,1,3,0
4,57,4,3,3,0
4,57,4,3,3,1
4,55,2,1,2,0
4,62,2,1,3,0
4,54,1,1,3,0
4,71,1,1,3,1
4,65,3,3,3,0
4,68,4,4,3,0
4,64,1,1,3,0
4,54,2,4,3,0
4,48,4,4,3,1
4,58,4,3,3,0
5,58,3,4,3,1
4,70,1,1,1,0
5,70,1,4,3,1
4,59,2,1,3,0
4,57,2,4,3,0
4,53,4,5,3,0
4,54,4,4,3,1
4,53,2,1,3,0
0,71,4,4,3,1
5,67,4,5,3,1
4,68,4,4,3,1
4,56,2,4,3,0
4,35,2,1,3,0
4,52,4,4,3,1
4,47,2,1,3,0
4,56,4,5,3,1
4,64,4,5,3,0
5,66,4,5,3,1
4,62,3,3,3,0"""

## Preprocessing

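The following code loads this data into a list. Along the way, it omits rows with missing values, discretizes age into four bins, and fixes typos in the BI-RADS column: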

In [3]:
from collections.abc import Sequence
import csv

COL_BIRADS = 0
COL_AGE = 1
COL_SHAPE = 2
COL_MARGIN = 3
COL_DENSITY = 4
COL_SEVERITY = 5


# possible values for each column in the data
cancer_domains: Sequence[Sequence[int]] = [
    range(1, 7),  # BI-RADS
    [0, 45, 55, 75],  # age
    range(1, 5),  # shape
    range(1, 6),  # margin
    range(1, 5),  # density
    range(2),  # severity
]


def process_row(vals: Sequence[str]) -> Sequence[int] | None:
    # omit rows that have missing data
    if "?" in vals:
        return None
    birads, age, shape, margin, density, severity = map(int, vals)
    # discretize age
    if age >= 75:
        age = 75
    elif age >= 55:
        age = 55
    elif age >= 45:
        age = 45
    else:
        age = 0
    # fix typos in birads column
    if birads == 0:
        birads = 1
    elif birads == 55:
        birads = 5
    return birads, age, shape, margin, density, severity


cancer_data = [
    row2
    for row in csv.reader(csv_file.split("\n"))
    if (row2 := process_row(row)) is not None
]

print("data loaded (%i rows)" % len(cancer_data))

data loaded (830 rows)
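
The age discretization can also be checked in isolation. The sketch below is an equivalent formulation using `bisect_right` from the standard library; the names `AGE_CUTS`, `AGE_LABELS`, and `discretize_age` are hypothetical helpers introduced here and are not used elsewhere in the notebook.

```python
from bisect import bisect_right

AGE_CUTS = [45, 55, 75]       # lower edges of the non-zero bins
AGE_LABELS = [0, 45, 55, 75]  # label of each bin, matching the age domain above


def discretize_age(age: int) -> int:
    # bisect_right counts how many cut points are <= age,
    # which is exactly the index of the bin that age falls into
    return AGE_LABELS[bisect_right(AGE_CUTS, age)]


labels = [discretize_age(age) for age in [18, 44, 45, 54, 55, 74, 75, 96]]
print(labels)  # [0, 0, 45, 45, 55, 55, 75, 75]
```

Each bin is labelled by its lower edge, so for instance the label 55 stands for ages 55 to 74.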


## Counts

We first process the data by calculating the counts $n(c)$ and $n(a_i,c)$ for every class value $c$ and attribute value $a_i$. It is fine if you do not fully understand this code.

In [4]:
from collections import Counter
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass
class Model:
    domains: Sequence[Sequence[int]]  # possible values
    c_column: int  # class column index
    a_columns: Sequence[int]  # attribute column indices
    n: int  # N, total number of observations
    nc: Mapping[int, int]  # n(c) as nc[c]
    nac: Mapping[int, Mapping[tuple[int, int], int]]  # n(a_i,c) as nac[i][a_i,c]
    s: float  # smoothing constant


def train_model(
    domains: Sequence[Sequence[int]],
    data: Sequence[Sequence[int]],
    c_column: int,
    a_columns: Sequence[int],
    s: float = 2.0,
) -> Model:
    assert all(all(val in vals for val, vals in zip(row, domains)) for row in data)
    nc = Counter(row[c_column] for row in data)
    nac = {
        a_column: Counter((row[a_column], row[c_column]) for row in data)
        for a_column in a_columns
    }
    return Model(
        domains=domains,
        c_column=c_column,
        a_columns=a_columns,
        n=len(data),
        nc=nc,
        nac=nac,
        s=s,
    )


cancer_model = train_model(
    domains=cancer_domains,
    data=cancer_data,
    c_column=COL_SEVERITY,
    a_columns=[COL_BIRADS, COL_AGE, COL_SHAPE, COL_MARGIN, COL_DENSITY],
)

We can now retrieve the counts very easily, as follows:

In [5]:
# number of patients without cancer (i.e. severity 0)
cancer_model.nc[0]

427

In [6]:
# number of patients with cancer (i.e. severity 1)
cancer_model.nc[1]

403

In [7]:
# number of severity 1 patients with BI-RADS assessment of 5
cancer_model.nac[COL_BIRADS][5, 1]

286
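
The lookups above rely on two properties of `Counter`: keys can be tuples, and a key that was never counted yields 0 instead of raising a `KeyError`. Here is a small sketch on made-up rows (the data and the names `toy_rows`, `n_c`, `n_ac` are illustrative, not taken from the cancer data):

```python
from collections import Counter

# toy rows of the form (attribute, class)
toy_rows = [(1, 0), (1, 0), (2, 0), (2, 1)]

n_c = Counter(c for _, c in toy_rows)        # n(c)
n_ac = Counter((a, c) for a, c in toy_rows)  # n(a, c), keyed by tuples

print(n_c[0], n_c[1])  # 3 1
print(n_ac[1, 0])      # 2  (same as n_ac[(1, 0)])
print(n_ac[1, 1])      # 0  (unseen pair: no KeyError)
```

This is why the classifier code below can look up any attribute and class combination from the domains, even one that never occurs in the training data.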

**Exercise** Find the number of patients in the dataset aged 75 or over who had no cancer.

In [8]:
# write your solution here

## Naive Bayes Classifier

To do the classification, we must make a decision based on the probability values, which we can derive from the counts $n(c)$ and $n(a_i,c)$. Let us first implement the usual naive Bayes classifier, and then move to the naive credal classifier. Recall that, for a given vector of attributes $a=(a_1,\dots,a_k)$,
we want to find the value for $c$ that maximizes
$$p(a,c)=p(c)\prod_{i=1}^k p(a_i|c)$$
In case of the naive Bayes classifier, we use the maximum likelihood estimates for the probabilities,
which happen to be given by the relative frequencies in the data:
$$p(c)=n(c)/N\qquad p(a_i|c)=n(a_i,c)/n(c)$$
Let's implement this:

In [9]:
from math import prod


def naive_bayes_prob_1(model: Model, test_row: Sequence[int], c: int) -> float:
    n = model.n
    nc = model.nc[c]
    nacs = [model.nac[a_column][test_row[a_column], c] for a_column in model.a_columns]
    pc = nc / n
    pacs = [nac / nc for nac in nacs]
    return pc * prod(pacs)

There is however one technical problem, which arises when some counts are zero.
Namely, if $N=0$ then the maximum likelihood estimate $p(c)=n(c)/N$ is undefined.
Similarly, if $n(c)=0$, then $p(a_i|c)=n(a_i,c)/n(c)$ is undefined.
This may result in a ``ZeroDivisionError`` in the code.
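
The following sketch reproduces the problem on a toy data set in which one class is never observed. The function `mle_prob` is a simplified, single-attribute stand-in for `naive_bayes_prob_1`, written here only for illustration.

```python
from collections import Counter

# toy training data of the form (attribute, class); class 1 never occurs
toy_rows = [(0, 0), (1, 0), (1, 0)]
n = len(toy_rows)
n_c = Counter(c for _, c in toy_rows)
n_ac = Counter(toy_rows)


def mle_prob(a: int, c: int) -> float:
    # p(c) * p(a|c) with relative frequencies
    return (n_c[c] / n) * (n_ac[a, c] / n_c[c])


print(mle_prob(1, 0))  # fine: class 0 has been observed
try:
    mle_prob(1, 1)  # n(c) = 0 for c = 1, so n(a,c)/n(c) is 0/0
    failed = False
except ZeroDivisionError:
    failed = True
print("ZeroDivisionError raised:", failed)
```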

To handle this, we can instead use the Bayesian estimates of $p(c)$ and $p(a_i|c)$ under a Dirichlet prior,
which we saw in the lectures:
$$p(c)=\frac{n(c)+st(c)}{N+s}\qquad p(a_i|c)=\frac{n(a_i,c)+st(a_i,c)}{n(c)+st(c)}$$
where we must fix a value for $s>0$,
and the values for $t(c)$ and $t(a_i,c)$, bearing in mind the constraints
$$\sum_{c}t(c)=1\qquad \sum_{a_i}t(a_i,c)=t(c)$$
For $s$, $s=2$ is a sensible and common default.
The usual choice for the $t$ parameters is to pick these values symmetrically:
$$t(c)=1/|\mathcal{C}|\qquad t(a_i,c)=t(c)/|\mathcal{A}_i|$$
where $|\mathcal{C}|$ denotes the number of possible classes,
and $|\mathcal{A}_i|$ denotes the number of possible values of the $i$th attribute.

Let us implement this:

In [10]:
def naive_bayes_prob_2(model: Model, test_row: Sequence[int], c: int) -> float:
    tc: float = 1 / len(model.domains[model.c_column])
    tacs: Sequence[float] = [
        tc / len(model.domains[a_column]) for a_column in model.a_columns
    ]
    n = model.n + model.s
    nc = model.nc[c] + model.s * tc
    nacs = [
        model.nac[a_column][test_row[a_column], c] + model.s * tac
        for a_column, tac in zip(model.a_columns, tacs)
    ]
    # p(c)=(n(c)+s*t(c))/(N+s)
    pc = nc / n
    # p(a_i|c)=(n(a_i,c)+s*t(a_i,c))/(n(c)+s*t(c))
    pacs = [nac / nc for nac in nacs]
    return pc * prod(pacs)
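
As a sanity check on the smoothed estimates, the sketch below evaluates the formulas by hand for a toy problem with two classes and a single binary attribute, where class 1 is never observed. All counts and names are made up for illustration; the formulas are the ones stated above with $s=2$ and symmetric $t$.

```python
s = 2.0
classes = [0, 1]
values = [0, 1]     # possible values of a single attribute
n = 3               # N
n_c = {0: 3, 1: 0}  # class 1 never observed
n_ac = {(0, 0): 1, (1, 0): 2, (0, 1): 0, (1, 1): 0}

t_c = 1 / len(classes)    # t(c) = 1/|C|
t_ac = t_c / len(values)  # t(a, c) = t(c)/|A|

p_c = {c: (n_c[c] + s * t_c) / (n + s) for c in classes}
p_ac = {
    (a, c): (n_ac[a, c] + s * t_ac) / (n_c[c] + s * t_c)
    for a in values
    for c in classes
}

print(p_c)  # p(0) = 4/5 and p(1) = 1/5
# each conditional distribution p(.|c) sums to one, even for the unseen class
print([sum(p_ac[a, c] for a in values) for c in classes])
```

No division by zero occurs, and the unseen class receives a small but positive probability.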

We now have everything in place to implement the naive Bayes classifier:

In [11]:
def naive_bayes_outcome(
    model: Model, test_row: Sequence[int]
) -> Sequence[float | None]:
    c_domain = model.domains[model.c_column]
    probs = {c: naive_bayes_prob_2(model, test_row, c) for c in c_domain}
    max_prob = max(probs.values())
    c_test = test_row[model.c_column]
    return [1 if probs[c_test] + TOL >= max_prob else 0]

The test returns a sequence containing a single number: ``1`` if the naive Bayes classifier is correct, or ``0`` if it is wrong. Returning this number inside a sequence makes little sense now, but when we consider more complex measures, a sequence will be very handy for reporting multiple measures at once.

Let us test each row of the original data set. To report the accuracy of the classifier, we simply average all the values. Again, the implementation here is slightly more complex than need be at this point: we will also exclude all ``None`` outcomes. This will be very handy when we consider more complex measures later in the context of credal classification.

In [12]:
from statistics import mean


def mean_outcome(outcomes: Sequence[Sequence[float | None]]) -> Sequence[float | None]:
    def _mean(xs: Sequence[float | None]) -> float | None:
        xs2 = [x for x in xs if x is not None]
        return mean(xs2) if xs2 else None

    return list(map(_mean, zip(*outcomes)))


mean_outcome([naive_bayes_outcome(cancer_model, row) for row in cancer_data])

[0.8385542168674699]

As we can see, the classifier has an accuracy of about 84%.
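
To illustrate how the averaging treats ``None``, here is a small sketch on made-up outcomes with two measures per test row. It restates the idea behind `mean_outcome` in condensed form; the names `toy_outcomes`, `columns`, and `averages` are illustrative only.

```python
from statistics import mean

# toy outcomes: each inner list is one test row, each position one measure
toy_outcomes = [
    [1, None],
    [0, 2],
    [1, None],
    [1, 3],
]

columns = list(zip(*toy_outcomes))  # transpose: one tuple per measure
# average each measure over the rows where it is defined
averages = [mean(x for x in column if x is not None) for column in columns]
print(columns)   # [(1, 0, 1, 1), (None, 2, None, 3)]
print(averages)  # [0.75, 2.5]
```

Unlike this sketch, `mean_outcome` also returns ``None`` for a measure that is undefined on every row, where `mean` on an empty input would raise an error.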

## k-Fold Cross Validation

We should not test on the same data that we used for training. Instead, we should split the data, train on one part, and test on the other. The next function abstracts this idea. It is fine if you do not fully understand the code.

In [13]:
from collections.abc import Callable


def kfcv_outcomes(
    # test(model, test_row) -> sequence of accuracy measures
    test: Callable[[Model, Sequence[int]], Sequence[float | None]],
    folds: int,
    domains: Sequence[Sequence[int]],
    data: Sequence[Sequence[int]],
    c_column: int,
    a_columns: Sequence[int],
    s: float = 2.0,
) -> Sequence[Sequence[float | None]]:
    outcomes = []
    for fold in range(folds):
        test_data = data[fold::folds]
        test_indices = range(fold, len(data), folds)
        train_data = [row for i, row in enumerate(data) if i not in test_indices]
        model = train_model(domains, train_data, c_column, a_columns, s)
        outcomes += [test(model, row) for row in test_data]
    return outcomes
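
The splitting logic is easiest to see on a tiny example. The sketch below applies the same slicing as `kfcv_outcomes` to ten stand-in rows and three folds; the variable `tested` is introduced here only to check that the folds form a partition.

```python
folds = 3
data = list(range(10))  # stand-in for ten rows, identified by their index

for fold in range(folds):
    test_data = data[fold::folds]  # every folds-th row, starting at fold
    test_indices = range(fold, len(data), folds)
    train_data = [row for i, row in enumerate(data) if i not in test_indices]
    print(fold, test_data, train_data)

# every row is used for testing exactly once across the folds
tested = sorted(row for fold in range(folds) for row in data[fold::folds])
print(tested == data)
```

For fold 0, the test rows are 0, 3, 6, 9 and the remaining six rows are used for training. Note that the folds are interleaved rather than contiguous, and no shuffling takes place.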

Let's test it on the cancer data set.

In [14]:
mean_outcome(
    kfcv_outcomes(
        test=naive_bayes_outcome,
        folds=10,
        domains=cancer_domains,
        data=cancer_data,
        c_column=COL_SEVERITY,
        a_columns=[COL_BIRADS, COL_AGE, COL_SHAPE, COL_MARGIN, COL_DENSITY],
    )
)

[0.8337349397590361]

We now have a full naive Bayes classifier running properly. Let's now move to the fun part: credal classification.

## Naive Credal Classifier

To implement our naive credal classifier, all we need to do is modify the ``naive_bayes_outcome`` function a little bit so we check for interval maximality. However, for convenience, we use a conservative approximation for the interval which is very quick to evaluate. (In the project, you will derive the exact bounds and investigate the impact of this approximation.)
Specifically, we will use the following expressions that we derived in the lectures:
$$
    \underline{p}(c,a)
    \ge
    \underbrace{\frac{n(c)}{N+s}}_{\underline{p}(c)}
    \prod_{i=1}^k
    \underbrace{\frac{n(a_i,c)}{n(c) + s}}_{\underline{p}(a_i|c)}
\qquad
    \overline{p}(c,a)
    \le
    \underbrace{\frac{n(c)+s}{N+s}}_{\overline{p}(c)}
    \prod_{i=1}^k
    \underbrace{\frac{n(a_i,c) + s}{n(c) + s}}_{\overline{p}(a_i|c)}
$$

In [15]:
def naive_credal_prob(
    model: Model, test_row: Sequence[int], c: int
) -> tuple[float, float]:
    def interval(a: float, b: float) -> tuple[float, float]:
        return a / (b + model.s), (a + model.s) / (b + model.s)

    pc = interval(model.nc[c], model.n)
    pacs = [
        interval(model.nac[a_column][test_row[a_column], c], model.nc[c])
        for a_column in model.a_columns
    ]
    return pc[0] * prod(pac[0] for pac in pacs), pc[1] * prod(pac[1] for pac in pacs)


def naive_credal_outcome(
    model: Model, test_row: Sequence[int]
) -> Sequence[float | None]:
    c_domain = model.domains[model.c_column]
    probs = {c: naive_credal_prob(model, test_row, c) for c in c_domain}
    max_lowprob = max(low for low, upp in probs.values())
    set_size = sum(1 if probs[c][1] + TOL >= max_lowprob else 0 for c in c_domain)
    c_test = test_row[model.c_column]
    correct = probs[c_test][1] + TOL >= max_lowprob
    return [
        1 if correct else 0,  # accuracy
        (1 if correct else 0) if set_size == 1 else None,  # single accuracy
        (1 if correct else 0) if set_size != 1 else None,  # set accuracy
        set_size if set_size != 1 else None,  # indeterminate set size
        1 if set_size == 1 else 0,  # determinacy
    ]


mean_outcome(
    kfcv_outcomes(
        test=naive_credal_outcome,
        folds=10,
        domains=cancer_domains,
        data=cancer_data,
        c_column=COL_SEVERITY,
        a_columns=[COL_BIRADS, COL_AGE, COL_SHAPE, COL_MARGIN, COL_DENSITY],
    )
)

[0.8409638554216867, 0.8384332925336597, 1, 2, 0.9843373493975903]
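
To make the interval maximality check concrete, the following sketch evaluates the approximate bounds by hand for a toy problem with two classes and a single attribute. The counts are made up, and `bounds`, `intervals`, and `retained` are illustrative names; the formulas and the comparison against the largest lower bound mirror `naive_credal_prob` and `naive_credal_outcome`.

```python
s = 2.0
n = 10               # N
n_c = {0: 6, 1: 4}   # n(c)
n_ac = {0: 3, 1: 1}  # n(a, c) for the observed attribute value a


def bounds(c: int) -> tuple[float, float]:
    # approximate lower and upper bounds on p(c, a), single attribute
    low = (n_c[c] / (n + s)) * (n_ac[c] / (n_c[c] + s))
    upp = ((n_c[c] + s) / (n + s)) * ((n_ac[c] + s) / (n_c[c] + s))
    return low, upp


intervals = {c: bounds(c) for c in n_c}
max_low = max(low for low, _ in intervals.values())
# a class is retained unless its upper bound lies below the best lower bound
retained = [c for c, (_, upp) in intervals.items() if upp + 1e-6 >= max_low]
print(intervals)  # class 0: about (0.1875, 0.4167), class 1: about (0.0556, 0.25)
print(retained)   # [0, 1]
```

Here the upper bound 0.25 of class 1 exceeds the lower bound 0.1875 of class 0, so the intervals overlap and both classes are returned.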

**Exercise** Why is the set accuracy equal to 100%, and why is the indeterminate set size equal to exactly 2?

*Write your answer here.*

**Exercise** Is credal classification useful for this data? If yes, why, if no, why not?

*Write your answer here.*

## Additional Exercises

**Exercise** Discuss the impact of the sample size on both the naive Bayes and on the credal classifier. The code below may be helpful.

In [16]:
for sample_size in [15, 30, 60, 120, 240, 480, 830]:
    print(
        sample_size,
        mean_outcome(
            kfcv_outcomes(
                test=naive_bayes_outcome,
                folds=10,
                domains=cancer_domains,
                data=cancer_data[:sample_size],
                c_column=COL_SEVERITY,
                a_columns=[COL_BIRADS, COL_AGE, COL_SHAPE, COL_MARGIN, COL_DENSITY],
            )
        ),
        mean_outcome(
            kfcv_outcomes(
                test=naive_credal_outcome,
                folds=10,
                domains=cancer_domains,
                data=cancer_data[:sample_size],
                c_column=COL_SEVERITY,
                a_columns=[COL_BIRADS, COL_AGE, COL_SHAPE, COL_MARGIN, COL_DENSITY],
            )
        ),
    )

15 [0.6] [1, 1, 1, 2, 0.06666666666666667]
30 [0.8] [0.9333333333333333, 0.875, 1, 2, 0.5333333333333333]
60 [0.8] [0.8833333333333333, 0.8541666666666666, 1, 2, 0.8]
120 [0.8083333333333333] [0.85, 0.8301886792452831, 1, 2, 0.8833333333333333]
240 [0.8375] [0.8541666666666666, 0.8444444444444444, 1, 2, 0.9375]
480 [0.8458333333333333] [0.8541666666666666, 0.8494623655913979, 1, 2, 0.96875]
830 [0.8337349397590361] [0.8409638554216867, 0.8384332925336597, 1, 2, 0.9843373493975903]


**Exercise** Discuss the impact of $s$ on the credal classifier. What value of $s$ seems most appropriate to you? The code below may be helpful.

In [17]:
for s in [0.01, 1, 2, 10, 100, 1000]:
    print(
        s,
        mean_outcome(
            kfcv_outcomes(
                test=naive_credal_outcome,
                folds=10,
                domains=cancer_domains,
                data=cancer_data,
                c_column=COL_SEVERITY,
                a_columns=[COL_BIRADS, COL_AGE, COL_SHAPE, COL_MARGIN, COL_DENSITY],
                s=s,
            )
        ),
    )

0.01 [0.8337349397590361, 0.8331318016928658, 1, 2, 0.9963855421686747]
1 [0.8385542168674699, 0.8363858363858364, 1, 2, 0.9867469879518073]
2 [0.8409638554216867, 0.8384332925336597, 1, 2, 0.9843373493975903]
10 [0.8626506024096385, 0.8507853403141361, 1, 2, 0.9204819277108434]
100 [0.9771084337349397, 0.9351535836177475, 1, 2, 0.3530120481927711]
1000 [1, None, 1, 2, 0]


**Exercise** There is a specific value for $s$ under which the naive Bayes classifier (using maximum likelihood estimates for the probabilities) is obtained as a special case of the credal classifier.

1. Identify this value.

2. Prove the claim using the formulae provided in the lectures.

3. Interpret this claim in the context of Wald's theorem which links frequentist inference and Bayesian inference.

*Write your answer here.*

**Exercise** It is customary, in classification, to simply learn the possible class values from the data.
However, in the code above, we state the possible values explicitly, through the ``domains`` parameter,
instead of deriving these values from the ``data`` parameter.
Explain why this is critical for the naive credal classifier, whilst it is less critical for the naive Bayes classifier.

Hint: What happens if a class value does not appear in the training data? The code below might be helpful.

In [18]:
model_1 = train_model(
    domains=[[0, 1], [0, 1]],  # c ∈ {0,1}, a ∈ {0,1}
    data=[[0, 0]],  # single observation (c=0,a=0)
    c_column=0,
    a_columns=[1],
    s=2,
)
model_2 = train_model(
    domains=[[0, 1, 2], [0, 1]],  # c ∈ {0,1,2}, a ∈ {0,1}
    data=[[0, 0]],  # same single observation
    c_column=0,
    a_columns=[1],
    s=2,
)
# predict class for a=0
print("bayes 1:", naive_bayes_outcome(model=model_1, test_row=[0, 0]))
print("bayes 2:", naive_bayes_outcome(model=model_2, test_row=[0, 0]))
print("credal 1:", naive_credal_outcome(model=model_1, test_row=[0, 0]))
print("credal 2:", naive_credal_outcome(model=model_2, test_row=[0, 0]))

bayes 1: [1]
bayes 2: [1]
credal 1: [1, None, 1, 2, 0]
credal 2: [1, None, 1, 3, 0]


*Write your answer here.*

# Project

The aim of the project is
for you to learn more about machine learning with bounded probability.
It consists of 3 tasks:

1. Further explore the breast cancer dataset that we introduced in the lectures.

2. Derive some theoretical results to improve the probability bounds that we used in the lectures.
   Use this theoretical result to improve the credal classifier from the lectures.

3. Derive some theoretical results concerning
   robust Bayes maximality and robust Bayes admissibility for the credal classifier.
   Use this theoretical result to further improve the credal classifier from the lectures.

Each task is subdivided into very specific subtasks, to guide you along.
Most subtasks require some coding in Python.
However, the 2nd and 3rd task also have subtasks
that concern purely theoretical questions to be solved on pen and paper.

You may do all three tasks, or only a selection of them; this is up to you.

Throughout, as a baseline,
the suggested sample size is $N=100$ (i.e. use ``data=cancer_data[:100]``)
and $s=2$ (this is the default if not specified).
However, you are encouraged to play around with these values if you believe it is useful.

## BI-RADS Analysis

The ultimate goal of collecting this data was to see if the doctor's BI-RADS assessment could be improved through image recognition (from which the shape, margin, and density attributes were derived).

1. Run the classifier to check whether or not the additional attributes can replace the BI-RADS assessment.

2. For their BI-RADS assessment, the doctor also has access to the images. Determine whether or not the classifier is good at predicting BI-RADS from the other attributes. Can you explain why the other attributes are good, or not so good, at predicting BI-RADS?

3. When using the credal classifier in the previous part to predict BI-RADS, you will now notice that the set accuracy is no longer 100% and that the indeterminate set size is no longer 2. Why is that? Interpret the new values.

4. Which of the attributes are most useful for classification? Should certain attributes be omitted?

In [19]:
# you can write your code here

## Exact Probability Bounds

The ``naive_credal_prob`` function in the code above
uses the approximate bounds for $\underline{p}(c,a)$ and $\overline{p}(c,a)$ that we saw in the lectures:
$$\underline{p}(c,a)=\inf_{t}\frac{n(c)+s t(c)}{N+s}\prod_{i=1}^k\frac{n(a_i,c)+s t(a_i,c)}{n(c) + s t(c)}\ge\frac{n(c)}{N+s}\prod_{i=1}^k\frac{n(a_i,c)}{n(c) + s}$$
$$\overline{p}(c,a)=\sup_{t}\frac{n(c)+s t(c)}{N+s}\prod_{i=1}^k\frac{n(a_i,c)+s t(a_i,c)}{n(c) + s t(c)}\le\frac{n(c)+s}{N+s}\prod_{i=1}^k\frac{n(a_i,c) + s}{n(c) + s}$$

1. In what way will this conservative approximation for the interval impact the classifier?

2. (The result of this exercise was proved by Zacch Lines in his Master's thesis.)
Find an expression for the exact values of $\underline{p}(c,a)$ and $\overline{p}(c,a)$.
Hint: First show that
$$\underline{p}(c,a)=\inf_{t(c)\in[0,1]}\frac{n(c)+s t(c)}{N+s}\prod_{i=1}^k\frac{n(a_i,c)}{n(c) + s t(c)}$$
$$\overline{p}(c,a)=\sup_{t(c)\in[0,1]}\frac{n(c)+s t(c)}{N+s}\prod_{i=1}^k\frac{n(a_i,c)+s t(c)}{n(c) + s t(c)}$$
Hint: One (but only one) of the approximate bounds will turn out to be exact.

3. Implement your improved bounds in the code.

4. Verify the impact your improved bounds have on the classification for a range of sample sizes, different values for $s$, and attributes. When doing so, pay particular attention to the number of attributes. For instance, investigate the impact when predicting severity just from BI-RADS, as opposed to predicting severity from all available attributes.

In [20]:
# you can write your code here

## Robust Bayes Maximality

We used interval maximality in our credal classifier,
as it is very easy to implement.

1. How would you go about implementing robust Bayes maximality for the credal classifier?
   Identify the computations required.

2. Recall that
   $$p_t(c,a)=\frac{n(c)+s t(c)}{N+s}\prod_{i=1}^k\frac{n(a_i,c)+s t(a_i,c)}{n(c) + s t(c)}$$
   with $\sum_{c}t(c)=1$, $\sum_{a_i}t(a_i,c)=t(c)$, $t(c)>0$, and $t(a_i,c)>0$.
   Show that $p_t(c,a)>p_t(c',a)$
   for all $t$ whenever (see Zaffalon, 2001)
   $$\left(\frac{n(c')+s t(c')}{n(c)+s (1-t(c'))}\right)^{k-1}\left(\prod_{i=1}^k \frac{n(a_i,c)}{n(a_i,c')+s t(c')}\right)>1$$
   for all $t(c')$ such that $0<t(c')<1$.

3. (This is only for the most enthusiastic students; it is fine if you do not have time to complete this!)
   Try to implement a robust Bayes maximal version of the credal classifier using Zaffalon's formula for dominance.
   Hint:
   Unfortunately, the left-hand side of the formula you derived in the previous part is not necessarily monotone in $t(c')$.
   However, as a rough approximation, you may for instance discretize the problem
   by assuming $t(c')\in\{0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99\}$
   and thereby brute-force the calculation.
   You can also make use of the ``is_maximal`` function
   once you realize that this code can also work with objects other than vectors of real numbers
   (such as, say, classes).

4. Test and compare your new classifier with the interval maximality one.
   Try predicting severity, but also try predicting BI-RADS (as in one of the previous tasks).
   Be sure to also play around with the sample size.
   How has the new classifier improved? Under what circumstances do you see the most improvement?
   If you were not able to complete the coding from the previous part,
   discuss how you think the new classifier will compare against the interval maximal one,
   justifying your claims theoretically.

5. Can you tell why nobody has implemented a robust Bayes admissible version of the credal classifier?
   What would it take to do so?

6. When predicting severity (i.e. cancer or not), the robust Bayes maximal credal classifier (which you have implemented)
   theoretically coincides with the robust Bayes admissible credal classifier (which we will not implement).
   However, when predicting BI-RADS, this is no longer the case.
   Explain why.
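
As a rough illustration of the grid check suggested in part 3, the sketch below evaluates the left-hand side of Zaffalon's formula from raw counts. It is independent of the notebook's ``Model`` class; the names ``zaffalon_lhs``, ``dominates_on_grid`` and ``GRID``, as well as the counts in the example, are assumptions for illustration only. Because the left-hand side need not be monotone in $t(c')$, passing the check on a finite grid only approximates the "for all $t(c')$" condition.

```python
from math import prod

# grid of values for t(c'), as suggested in the hint of part 3
GRID = [0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99]


def zaffalon_lhs(n_c, n_c2, n_ac, n_ac2, s, t2):
    """Left-hand side of the dominance condition of class c over class c'.

    n_c, n_c2   -- class counts n(c) and n(c')
    n_ac, n_ac2 -- lists of n(a_i, c) and n(a_i, c'), one entry per attribute
    s           -- the hyperparameter s of the model
    t2          -- the value of t(c'), strictly between 0 and 1
    """
    k = len(n_ac)
    ratio = (n_c2 + s * t2) / (n_c + s * (1 - t2))
    factors = prod(a / (a2 + s * t2) for a, a2 in zip(n_ac, n_ac2))
    return ratio ** (k - 1) * factors


def dominates_on_grid(n_c, n_c2, n_ac, n_ac2, s, tol=1e-6):
    """Approximate check that c dominates c': the condition holds on every grid point."""
    return all(
        zaffalon_lhs(n_c, n_c2, n_ac, n_ac2, s, t2) > 1 + tol for t2 in GRID
    )


# made-up counts: two classes of 50 rows each, two attributes
print(dominates_on_grid(50, 50, [40, 45], [5, 10], s=2.0))
print(dominates_on_grid(50, 50, [5, 10], [40, 45], s=2.0))
```

The default ``tol`` mirrors the global ``TOL`` set at the top of the notebook. Note that a single zero count $n(a_i,c)$ makes the left-hand side zero, so in that case $c$ can never dominate $c'$ under this criterion.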

Since the coding part is potentially not so straightforward,
some code to get started is provided below for your convenience.

In [21]:
# you can write your code here


def is_maximal(
    dominates: Callable[[int, int], bool],  # compares two classes
    cs: Sequence[int],  # sequence of classes
) -> Sequence[bool]:
    """For each class in cs, report whether no class in cs dominates it."""

    def is_not_dominated(c1: int) -> bool:
        return all(not dominates(c2, c1) for c2 in cs)

    return [is_not_dominated(c1) for c1 in cs]


def naive_credal_outcome_2(
    model: Model, test_row: Sequence[int]
) -> Sequence[float | None]:

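    def dominates(c1: int, c2: int) -> bool:
        # c1 dominates c2 if the left-hand side of Zaffalon's formula
        # (with c = c1, c' = c2, and t standing for t(c')) exceeds 1 on the grid
        return all(
            ... > 1 + TOL  # TODO use Zaffalon's formula
            for t in [0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99]
        )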

    c_domain = model.domains[model.c_column]
    is_max_cs = is_maximal(dominates, c_domain)
    set_size = sum(is_max_cs)
    c_test = test_row[model.c_column]
    correct = is_max_cs[c_domain.index(c_test)]
    return [
        1 if correct else 0,  # accuracy
        (1 if correct else 0) if set_size == 1 else None,  # single accuracy
        (1 if correct else 0) if set_size != 1 else None,  # set accuracy
        set_size if set_size != 1 else None,  # indeterminate set size
        1 if set_size == 1 else 0,  # determinacy
    ]