propeller.downsample
FIXME: write docs
assign-indices-to-data
(assign-indices-to-data training-data argmap)
Assigns an index to each training case so that cases can be told apart when downsampling.
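As a rough illustration of index assignment, the sketch below tags each case with its position. It assumes training cases are maps and that the index is stored under an `:index` key; both the key name and the helper name `assign-indices-sketch` are assumptions made for this example, not taken from the library source.

```clojure
;; Hypothetical sketch: attach a positional index to every training case.
;; Assumes each case is a map; the :index key name is an assumption.
(defn assign-indices-sketch
  [training-data]
  (map-indexed (fn [i tcase] (assoc tcase :index i)) training-data))

(assign-indices-sketch [{:input1 [0]} {:input1 [1]}])
;; => ({:input1 [0], :index 0} {:input1 [1], :index 1})
```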
convert-to-elite-error
(convert-to-elite-error errors)
Converts a set of errors into a list in which every elite error is replaced with 0, so that the result can be used to select down-samples with elite/not-elite selection.
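One possible reading of this conversion is sketched below. It assumes that `errors` is a collection of error lists and that the "elite" errors in each list are the ones equal to that list's minimum. The helper names `mins->zero` and `elite-errors-sketch` are hypothetical.

```clojure
;; Hypothetical helper: replace every occurrence of the minimum with 0.
(defn mins->zero
  [coll]
  (if (empty? coll)
    coll
    (let [m (apply min coll)]
      (map #(if (= % m) 0 %) coll))))

;; Hypothetical sketch: treat the minimum of each inner list as elite.
(defn elite-errors-sketch
  [error-lists]
  (map mins->zero error-lists))

(elite-errors-sketch [[3 1 4] [2 2 5] [0 7 1]])
;; => ((3 0 4) (0 0 5) (0 7 1))
```

Note that ties for the minimum are all zeroed, so a list can contain more than one elite entry.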
convert-to-soft-error
(convert-to-soft-error errors delta)
FIXME: write docs
get-distance-between-cases
(get-distance-between-cases error-lists case-index-1 case-index-2)
Returns the distance between two cases, given a list of individual error vectors and the indices of the two cases within each error vector. Only distinguishes between zero and nonzero errors.
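A minimal sketch of such a distance, assuming it is a Hamming-style count of the individuals that solve (zero error) exactly one of the two cases. The name `case-distance-sketch` is hypothetical, and the sketch omits any bounds checking the real function may perform.

```clojure
;; Hypothetical sketch: count the individuals whose zero/nonzero status
;; differs between case i and case j.
(defn case-distance-sketch
  [error-lists i j]
  (count (filter (fn [errs]
                   (not= (zero? (nth errs i)) (zero? (nth errs j))))
                 error-lists)))

;; Three individuals, three cases each.
(case-distance-sketch [[0 0 5] [0 3 0] [2 0 0]] 0 1)
;; => 2
```

Because only the zero/nonzero distinction matters, errors of 3 and 5 on the same case contribute nothing to the distance.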
initialize-case-distances
(initialize-case-distances {:keys [training-data population-size], :as argmap})
FIXME: write docs
merge-map-lists-at-index
(merge-map-lists-at-index big-list small-list)
Merges two lists of maps, replacing each map in the big list with the corresponding map (matched by index) in the small list.
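The sketch below shows one way such a merge could work, assuming that "index" refers to an `:index` key stored in each map rather than to list position. That interpretation, and the name `merge-by-index-sketch`, are assumptions.

```clojure
;; Hypothetical sketch: look up each big-list map by :index in small-list
;; and substitute the small-list map when one exists.
(defn merge-by-index-sketch
  [big-list small-list]
  (let [by-index (into {} (map (juxt :index identity)) small-list)]
    (map #(get by-index (:index %) %) big-list)))

(merge-by-index-sketch
 [{:index 0 :d 1} {:index 1 :d 2} {:index 2 :d 3}]
 [{:index 1 :d 99}])
;; => ({:index 0, :d 1} {:index 1, :d 99} {:index 2, :d 3})
```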
replace-close-zero-with-zero
(replace-close-zero-with-zero coll delta)
Replaces values within delta of zero with zero. Used for regression problems.
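A small sketch of this thresholding follows. Whether the boundary is inclusive and whether negative values are handled by absolute value are assumptions here, as is the name `near-zero->zero`.

```clojure
;; Hypothetical sketch: values whose magnitude is at most delta become 0.
;; The inclusive comparison and the use of absolute value are assumptions.
(defn near-zero->zero
  [coll delta]
  (map #(if (<= (Math/abs (double %)) delta) 0 %) coll))

(near-zero->zero [0.05 0.5 0 2.0] 0.1)
;; => (0 0.5 0 2.0)
```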
replace-mins-with-zero
(replace-mins-with-zero coll)
Replaces the minimum value(s) in a list with zero.
select-downsample-maxmin
(select-downsample-maxmin training-data {:keys [downsample-rate]})
Selects a downsample whose cases are maximally far apart by sequentially adding the case whose closest case in the downsample is maximally far away.
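The greedy procedure described here resembles farthest-first selection. The sketch below works on a precomputed distance matrix and returns case indices. The matrix argument, the explicit `start` index, the target size `n` (assumed to be at least 1), and the name `maxmin-indices-sketch` are all assumptions for illustration; how the real function obtains distances, picks its first case, or derives the sample size from `downsample-rate` is not shown in this page.

```clojure
;; Hypothetical sketch of max-min selection over a distance matrix.
;; dist is a square vector of vectors, start is the first chosen index,
;; n is the number of cases to select (assumed >= 1).
(defn maxmin-indices-sketch
  [dist start n]
  (loop [chosen [start]
         remaining (disj (set (range (count dist))) start)]
    (if (or (>= (count chosen) n) (empty? remaining))
      chosen
      (let [;; distance from candidate i to its closest chosen case
            min-dist (fn [i] (apply min (map #(get-in dist [i %]) chosen)))
            ;; candidates are sorted so the scan order is consistent
            nxt (apply max-key min-dist (sort remaining))]
        (recur (conj chosen nxt) (disj remaining nxt))))))

(def example-dist
  [[0 1 4 3]
   [1 0 3 3]
   [4 3 0 2]
   [3 3 2 0]])

(maxmin-indices-sketch example-dist 0 3)
;; => [0 2 3]
```

Starting from case 0, case 2 is farthest (distance 4). Case 3 is then at distance 2 from its closest chosen case, while case 1 is only at distance 1, so case 3 is added next.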
select-downsample-maxmin-adaptive
(select-downsample-maxmin-adaptive training-data {:keys [case-delta]})
Selects a downsample whose cases are maximally far apart by sequentially adding the case whose closest case in the downsample is maximally far away. Automatically stops when the maximum minimum distance is below delta.
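The adaptive stopping rule can be sketched as follows, again over a hypothetical precomputed distance matrix. Selection ends as soon as the best remaining candidate is closer than `delta` to the cases already chosen. The function name, the `dist` and `start` arguments, and the strict less-than comparison are assumptions.

```clojure
;; Hypothetical sketch: max-min selection that stops once the largest
;; minimum distance among the remaining candidates falls below delta.
(defn maxmin-adaptive-sketch
  [dist start delta]
  (loop [chosen [start]
         remaining (disj (set (range (count dist))) start)]
    (if (empty? remaining)
      chosen
      (let [min-dist (fn [i] (apply min (map #(get-in dist [i %]) chosen)))
            nxt (apply max-key min-dist (sort remaining))]
        (if (< (min-dist nxt) delta)
          chosen
          (recur (conj chosen nxt) (disj remaining nxt)))))))

(def adaptive-dist
  [[0 1 4 3]
   [1 0 3 3]
   [4 3 0 2]
   [3 3 2 0]])

(maxmin-adaptive-sketch adaptive-dist 0 3)
;; => [0 2]
(maxmin-adaptive-sketch adaptive-dist 0 2)
;; => [0 2 3]
```

With this rule the size of the downsample is not fixed in advance: a smaller delta admits more cases.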
select-downsample-random
(select-downsample-random training-data {:keys [downsample-rate]})
Selects a random downsample from the training cases and returns it.
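A minimal sketch of random downsampling, assuming `downsample-rate` is a fraction between 0 and 1 and that the sample size is the truncated product of the rate and the number of cases. The rounding behaviour and the name `random-downsample-sketch` are assumptions.

```clojure
;; Hypothetical sketch: shuffle the cases and keep the first
;; (rate * count) of them, truncated to an integer.
(defn random-downsample-sketch
  [training-data downsample-rate]
  (take (int (* downsample-rate (count training-data)))
        (shuffle training-data)))

;; Keeps 2 of the 10 cases; which two varies from call to call.
(count (random-downsample-sketch (range 10) 0.2))
;; => 2
```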
update-at-indices
(update-at-indices big-vec small-vec indices)
Merges two vectors at the indices provided by a third vector.
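One plausible interpretation, sketched below, is that the k-th element of `small-vec` overwrites the element of `big-vec` at position `(nth indices k)`. This reading and the name `update-at-indices-sketch` are assumptions.

```clojure
;; Hypothetical sketch: write each small-vec element into big-vec at the
;; matching position from indices. big-vec must be a vector.
(defn update-at-indices-sketch
  [big-vec small-vec indices]
  (reduce (fn [v [i x]] (assoc v i x))
          big-vec
          (map vector indices small-vec)))

(update-at-indices-sketch [10 20 30 40] [:a :b] [1 3])
;; => [10 :a 30 :b]
```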
update-case-distances
(update-case-distances evaluated-pop ds-data training-data ids-type)
(update-case-distances evaluated-pop ds-data training-data ids-type solution-threshold)
Updates the case-distance field of the training-data list. Should be called after individuals have been evaluated. evaluated-pop should be a list of individuals that all have an :errors field holding that individual's performance on each case in training-data, in order. ids-type is :elite to use elite/not-elite, :soft to consider near solves, and :solved to use solved/not-solved.