Adding a Problem

In general, a problem file has 3 components: train-and-test-data, instructions, error-function, and -main.

  1. To add a new problem, you need training and test data. For Problem Synthesis Benchmark Problems (PSB2), you can fetch datasets using psb2.core/fetch-examples.
  2. (defn fetch-examples
      "Fetches and returns training and test data from a PSB2 problem.
       Returns a map of the form {:train training-examples :test testing-examples}
       where training-examples and testing-examples are lists of training and test
       data. The elements of these lists are maps of the form:
       {:input1 first-input :input2 second-input ... :output1 first-output ...}
       The training examples will include all hard-coded edge cases included in the suite,
       along with enough random cases to include `n-train` cases.
       Note that this function loads large datasets and can be slow, 30-120 seconds.
       Parameters:
         `datasets-directory` - Location of the PSB2 datasets as downloaded from https://zenodo.org/record/4678739
         `problem-name` - Name of the PSB2 problem, lowercase and seperated by dashes.
             - Ex: indices-of-substring
         `n-train` - Number of training cases to return
         `n-test` - Number of test cases to return"
      [datasets-directory problem-name n-train n-test]
    
    
  3. Define the possible Push instructions to be used to create plushys. It should be a non-lazy list of instructions from push/instructions, input instructions, close, and constants (including functions that produce constants).
  4. Define an error function that will evaluate plushys and add :behaviors parsed-outputs, :errors, and :total-error to the individual
  5. Define the function -main with a map of default arguments.

Example of a Problem

(ns propeller.problems.PSB2.solve-boolean
  (:require [psb2.core :as psb2]
            [propeller.genome :as genome]
            [propeller.push.interpreter :as interpreter]
            [propeller.utils :as utils]
            [propeller.push.instructions :refer [get-stack-instructions]]
            [propeller.push.state :as state]
            [propeller.gp :as gp]
            #?(:cljs [cljs.reader :refer [read-string]])))

; ===========  PROBLEM DESCRIPTION  ================================
; SOLVE BOOLEAN from PSB2
; Given a string representing a Boolean
; expression consisting of T, F, |, and &, evaluate it and return
; the resulting Boolean.
;
; Source: https://arxiv.org/pdf/2106.06086.pdf
; ==================================================================

(def train-and-test-data (psb2/fetch-examples "data" "solve-boolean" 200 2000))

(def instructions
  (utils/not-lazy
    (concat
      ;;; stack-specific instructions
      (get-stack-instructions #{:exec :integer :boolean :char :string :print})
      ;;; input instructions
      (list :in1)
      ;;; close
      (list 'close)
      ;;; ERCs (constants)
      (list true false \t \f \& \|))))

(defn error-function
  [argmap data individual]
  (let [program (genome/plushy->push (:plushy individual) argmap)
        inputs (map (fn [i] (get i :input1)) data)
        correct-outputs (map (fn [i] (get i :output1)) data)
        outputs (map (fn [input]
                       (state/peek-stack
                         (interpreter/interpret-program
                           program
                           (assoc state/empty-state :input {:in1 input})
                           (:step-limit argmap))
                         :boolean))
                     inputs)
        parsed-outputs (map (fn [output]
                              (try (read-string output)
                                   #?(:clj  (catch Exception e 1000.0)
                                      :cljs (catch js/Error. e 1000.0))))
                            outputs)
        errors (map (fn [correct-output output]
                      (if (= output :no-stack-item)
                        10000
                        (if (= correct-output output)
                           0
                           1)))
                    correct-outputs
                    parsed-outputs)]
    (assoc individual
      :behaviors parsed-outputs
      :errors errors
      :total-error #?(:clj  (apply +' errors)
                      :cljs (apply + errors)))))

(defn -main
  "Runs propel-gp, giving it a map of arguments."
  [& args]
  (gp/gp
    (merge
      {:instructions            instructions
       :error-function          error-function
       :training-data           (:train train-and-test-data)
       :testing-data            (:test train-and-test-data)
       :max-generations         300
       :population-size         1000
       :max-initial-plushy-size 250
       :step-limit              2000
       :parent-selection        :lexicase
       :tournament-size         5
       :umad-rate               0.1
       :variation               {:umad 1.0 :crossover 0.0}
       :elitism                 false}
      (apply hash-map (map #(if (string? %) (read-string %) %) args)))))