{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Example for evolutionary regression with parametrized nodes\n\nExample demonstrating the use of Cartesian genetic programming for a\nregression task that requires fine tuning of constants in parametrized\nnodes. This is achieved by introducing a new node, \"ParametrizedAdd\"\nwhich produces a scaled and shifted version of the sum of its inputs.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# The docopt str is added explicitly to ensure compatibility with\n# sphinx-gallery.\ndocopt_str = \"\"\"\n  Usage:\n    example_parametrized_nodes.py [--max-generations=<N>]\n\n  Options:\n    -h --help\n    --max-generations=<N>  Maximum number of generations [default: 500]\n\"\"\"\n\nimport functools\nimport math\n\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport scipy.constants\nimport torch\nfrom docopt import docopt\n\nimport cgp\n\nargs = docopt(docopt_str)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We first define a new node that adds the values of its two inputs then scales and\nfinally shifts the result. The scale (\"w\") and shift factors (\"b\")\nare parameters that are adapted by local search. We need to define\nthe arity of the node, callables for the initial values for the\nparameters and the operation of the node as a string. In this string\nparameters are enclosed in angle brackets, inputs are denoted by \"x_i\"\nwith i representing their corresponding index.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "class ParametrizedAdd(cgp.OperatorNode):\n    \"\"\"A node that adds its two inputs.\n\n    The result of addition is scaled by w and shifted by b. Both these\n    parameters can be adapted via local search are passed on from\n    parents to their offspring.\n\n    \"\"\"\n\n    _arity = 2\n    _initial_values = {\"<w>\": lambda: 1.0, \"<b>\": lambda: 0.0}\n    _def_output = \"<w> * (x_0 + x_1) + <b>\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We define a target function which contains numerical constants that\nare not available as constants for the search and need to be found\nby local search on parameterized nodes.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def f_target(x):\n    return math.pi * (x[:, 0] + x[:, 1]) + math.e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Then we define a differentiable(!) inner objective function for the\nevolution. This function accepts a torch class as a parameter. It\nreturns the mean-squared error between the output of the forward\npass of this class and the target function evaluated on a set of\nrandom points. This inner objective is then used by actual objective\nfunction to determine the fitness of the individual.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def inner_objective(f, seed):\n    torch.manual_seed(seed)\n    batch_size = 500\n    x = torch.DoubleTensor(batch_size, 2).uniform_(-5, 5)\n    y = f(x)\n    return torch.nn.MSELoss()(f_target(x), y[:, 0])\n\n\ndef objective(individual, seed):\n    if not individual.fitness_is_None():\n        return individual\n\n    f = individual.to_torch()\n    loss = inner_objective(f, seed)\n    individual.fitness = -loss.item()\n\n    return individual"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Next, we define the parameters for the population, the genome of\nindividuals, the evolutionary algorithm, and the local search. Note\nthat we add the custom node defined above as a primitive.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "population_params = {\"n_parents\": 1, \"seed\": 818821}\n\ngenome_params = {\n    \"n_inputs\": 2,\n    \"n_outputs\": 1,\n    \"n_columns\": 5,\n    \"n_rows\": 1,\n    \"levels_back\": None,\n    \"primitives\": (ParametrizedAdd, cgp.Add, cgp.Sub, cgp.Mul),\n}\n\nea_params = {\"n_offsprings\": 4, \"tournament_size\": 1, \"mutation_rate\": 0.04, \"n_processes\": 2}\n\nevolve_params = {\"max_generations\": int(args[\"--max-generations\"]), \"termination_fitness\": 0.0}\n\nlocal_search_params = {\"lr\": 1e-3, \"gradient_steps\": 9}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We then create a Population instance and instantiate the local search\nand evolutionary algorithm.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "pop = cgp.Population(**population_params, genome_params=genome_params)\n\nlocal_search = functools.partial(\n    cgp.local_search.gradient_based,\n    objective=functools.partial(inner_objective, seed=population_params[\"seed\"]),\n    **local_search_params,\n)\n\nea = cgp.ea.MuPlusLambda(**ea_params, local_search=local_search)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We define a recording callback closure for bookkeeping of the progression of the evolution.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "history = {}\nhistory[\"fitness_champion\"] = []\nhistory[\"expr_champion\"] = []\n\n\ndef recording_callback(pop):\n    history[\"fitness_champion\"].append(pop.champion.fitness)\n    history[\"expr_champion\"].append(pop.champion.to_sympy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We fix the seed for the objective function to make sure results are\ncomparable across individuals and, finally, we call the `evolve`\nmethod to perform the evolutionary search.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "obj = functools.partial(objective, seed=population_params[\"seed\"])\n\ncgp.evolve(pop, obj, ea, **evolve_params, print_progress=True, callback=recording_callback)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "After the evolutionary search has ended, we print the expression\nwith the highest fitness and plot the progression of the search.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(f\"Final expression {pop.champion.to_sympy()[0]} with fitness {pop.champion.fitness}\")\n\nprint(\"Best performing expression per generation (for fitness increase > 0.5):\")\nold_fitness = -np.inf\nfor i, (fitness, expr) in enumerate(zip(history[\"fitness_champion\"], history[\"expr_champion\"])):\n    delta_fitness = fitness - old_fitness\n    if delta_fitness > 0.5:\n        print(f\"{i:3d}: {fitness}, {expr}\")\n        old_fitness = fitness\nprint(f\"{i:3d}: {fitness}, {expr}\")\n\nwidth = 9.0\n\nfig = plt.figure(figsize=(width, width / scipy.constants.golden))\n\nax_fitness = fig.add_subplot(111)\nax_fitness.set_xlabel(\"Generation\")\nax_fitness.set_ylabel(\"Fitness\")\nax_fitness.set_yscale(\"symlog\")\n\nax_fitness.axhline(0.0, color=\"k\")\nax_fitness.plot(history[\"fitness_champion\"], lw=2)\n\nplt.savefig(\"example_parametrized_nodes.pdf\", dpi=300)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.6"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}