Research Article

Human-level concept learning through probabilistic program induction

Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum

Science  11 Dec 2015:
Vol. 350, Issue 6266, pp. 1332-1338
DOI: 10.1126/science.aab3050

  1. Fig. 1 People can learn rich concepts from limited data.

    (A and B) A single example of a new concept (red boxes) can be enough information to support the (i) classification of new examples, (ii) generation of new examples, (iii) parsing an object into parts and relations (parts segmented by color), and (iv) generation of new concepts from related concepts. [Image credit for (A), iv, bottom: With permission from Glenn Roberts and Motorcycle Mojo Magazine]

  2. Fig. 2 Simple visual concepts for comparing human and machine learning.

    525 (out of 1623) character concepts, shown with one example each.

  3. Fig. 3 A generative model of handwritten characters.

    (A) New types are generated by choosing primitive actions (color coded) from a library (i), combining these subparts (ii) to make parts (iii), and combining parts with relations to define simple programs (iv). New tokens are generated by running these programs (v), which are then rendered as raw data (vi). (B) Pseudocode for generating new types ψ and new token images I^(m) for m = 1, ..., M. The function f(·, ·) transforms a subpart sequence and start location into a trajectory. (A minimal code sketch of this generative process follows the figure list.)

  4. Fig. 4 Inferring motor programs from images.

    Parts are distinguished by color, with a colored dot indicating the beginning of a stroke and an arrowhead indicating the end. (A) The top row shows the five best programs discovered for an image along with their log-probability scores (Eq. 1). Subpart breaks are shown as black dots. For classification, each program was refit to three new test images (left in image triplets), and the best-fitting parse (top right) is shown with its image reconstruction (bottom right) and classification score (log posterior predictive probability). The correctly matching test item receives a much higher classification score and is also more cleanly reconstructed by the best programs induced from the training item. (B) Nine human drawings of three characters (left) are shown with their ground truth parses (middle) and best model parses (right). (A minimal sketch of this classification score follows the figure list.)

  5. Fig. 5 Generating new exemplars.

    Humans and machines were given an image of a novel character (top) and asked to produce new exemplars. The nine-character grids in each pair that were generated by a machine are (by row) 1, 2; 2, 1; 1, 1.

  6. Fig. 6 Human and machine performance was compared on (A) one-shot classification and (B) four generative tasks.

    The creative outputs of humans and models were compared by the percentage of human judges who could correctly identify the machine. Ideal performance is 50%, where the machine is perfectly confusable with humans in these two-alternative forced choice tasks (pink dotted line). Bars show the mean ± SEM [N = 10 alphabets in (A)]. The no learning-to-learn lesion is applied at different levels (bars left to right): (A) token; (B) token, stroke order, type, and type.

  7. Fig. 7 Generating new concepts.

    (A) Humans and machines were given a novel alphabet (i) and asked to produce new characters for that alphabet. New machine-generated characters are shown in (ii). Human and machine productions can be compared in (iii). The four-character grids in each pair that were generated by the machine are (by row) 1, 1, 2; 1, 2. (B) Humans and machines produced new characters without a reference alphabet. The grids that were generated by a machine are 2; 1; 1; 2.
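
The generative model summarized in Fig. 3 can be illustrated with a short program. The sketch below is not the authors' BPL implementation; it is a minimal toy version assuming a hand-written primitive library (PRIMITIVES), only two kinds of relations (start independently or attach to the end of the previous stroke), Gaussian motor noise in place of the learned token-level model, and a crude grid renderer, all of which stand in for components the paper learns from data.

```python
import random

# Toy library of primitive pen movements (cf. Fig. 3A, i). Each primitive is a
# unit displacement; the real model uses a library of spline-based primitives
# learned from background data.
PRIMITIVES = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1)]


def sample_type(max_parts=3, max_subparts=3):
    """Sample a character type psi: parts built from subparts, plus relations.

    Mirrors the structure of Fig. 3B at a toy level: the number of parts, each
    part's subpart sequence, and a relation attaching each part either to a
    fresh start location or to the end of the previous part.
    """
    kappa = random.randint(1, max_parts)  # number of parts
    parts, relations = [], []
    for i in range(kappa):
        n_sub = random.randint(1, max_subparts)
        parts.append([random.choice(PRIMITIVES) for _ in range(n_sub)])
        relations.append("independent" if i == 0 or random.random() < 0.5
                         else "attach-prev")
    return {"parts": parts, "relations": relations}


def f(subparts, start, noise=0.3):
    """Toy stand-in for f(., .): turn a subpart sequence and a start location
    into a pen trajectory, adding token-level motor noise."""
    x, y = start
    trajectory = [(x, y)]
    for dx, dy in subparts:
        x += dx + random.gauss(0, noise)
        y += dy + random.gauss(0, noise)
        trajectory.append((x, y))
    return trajectory


def sample_token(psi, canvas=12):
    """Run the type-level program to produce one token (Fig. 3A, v) and render
    it as a coarse binary image (Fig. 3A, vi)."""
    strokes, prev_end = [], None
    for subparts, relation in zip(psi["parts"], psi["relations"]):
        if relation == "attach-prev" and prev_end is not None:
            start = prev_end
        else:
            start = (random.uniform(2, canvas - 3), random.uniform(2, canvas - 3))
        trajectory = f(subparts, start)
        strokes.append(trajectory)
        prev_end = trajectory[-1]
    image = [[0] * canvas for _ in range(canvas)]
    for trajectory in strokes:
        for x, y in trajectory:
            row = min(max(int(round(y)), 0), canvas - 1)
            col = min(max(int(round(x)), 0), canvas - 1)
            image[row][col] = 1
    return image


if __name__ == "__main__":
    psi = sample_type()                             # one new concept
    tokens = [sample_token(psi) for _ in range(3)]  # three exemplars of it
    for row in tokens[0]:
        print("".join("#" if cell else "." for cell in row))
```

Running it samples one type ψ and three tokens of that type, loosely mirroring the type-then-token structure of Fig. 3B; none of the toy distributions or constants are taken from the paper.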
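
The classification score in Fig. 4A (a log posterior predictive probability) can also be sketched. The snippet below assumes the K best parses of a training image come with log-probability scores, and that each parse has been refit to a test image to give a log-likelihood; it then approximates log P(I_test | I_train) with a weighted log-sum-exp over those parses. The function names (log_posterior_predictive, classify) and every number in the demo are illustrative only, and the paper's full rule also refits in the reverse direction (a two-way score), which this sketch omits.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_posterior_predictive(parse_log_scores, refit_log_likelihoods):
    """Approximate log P(I_test | I_train) from the K best parses of the
    training image, each refit to the test image.

    parse_log_scores      -- log-probability scores of the K training parses
                             (cf. the scores shown above each program in
                             Fig. 4A), normalized here into posterior weights.
    refit_log_likelihoods -- log-likelihood of the test image under each parse
                             after its token-level variables were refit.
    """
    norm = logsumexp(parse_log_scores)
    return logsumexp([score - norm + ll
                      for score, ll in zip(parse_log_scores,
                                           refit_log_likelihoods)])


def classify(scores_by_training_item):
    """One-shot classification: choose the training item whose parses give the
    test image the highest approximate log posterior predictive score."""
    return max(scores_by_training_item, key=scores_by_training_item.get)


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the paper): each candidate
    # training character contributes five parse scores and the refit
    # log-likelihoods of the same test image under those parses.
    candidates = {
        "train_A": ([-60.0, -61.2, -63.5, -64.1, -66.0],   # matching character
                    [-35.0, -36.5, -40.0, -41.2, -44.0]),
        "train_B": ([-58.0, -59.4, -62.0, -63.3, -65.7],   # non-matching
                    [-80.0, -82.3, -85.1, -88.0, -90.5]),
    }
    scores = {name: log_posterior_predictive(parse_scores, refit_lls)
              for name, (parse_scores, refit_lls) in candidates.items()}
    print(scores, "->", classify(scores))
```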