hmm_example.ipynb 10.9 KiB
    {
     "cells": [
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "# Hidden Markov Model Example"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "authors:<br>\n",
        "Jacob Schreiber [<a href=\"mailto:jmschreiber91@gmail.com\">jmschreiber91@gmail.com</a>]<br>\n",
        "Nicholas Farn [<a href=\"mailto:nicholasfarn@gmail.com\">nicholasfarn@gmail.com</a>]"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "A simple example highlighting how to build a model using states, add\n",
        "transitions, and then run the algorithms, including showing how training\n",
        "on a sequence improves the probability of the sequence."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 1,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "import random\n",
        "from pomegranate import *\n",
        "\n",
        "random.seed(0)"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "First we will create the states of the model, one uniform and one normal."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 2,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "state1 = State( UniformDistribution(0.0, 1.0), name=\"uniform\" )\n",
        "state2 = State( NormalDistribution(0, 2), name=\"normal\" )"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "We will then create the model by creating a HiddenMarkovModel instance. Then we will add the states."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 3,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "model = HiddenMarkovModel( name=\"ExampleModel\" )\n",
        "model.add_state( state1 )\n",
        "model.add_state( state2 )"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Now we'll add the start states to the model."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 4,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "model.add_transition( model.start, state1, 0.5 )\n",
        "model.add_transition( model.start, state2, 0.5 )"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "And the transition matrix."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 5,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "model.add_transition( state1, state1, 0.4 )\n",
        "model.add_transition( state1, state2, 0.4 )\n",
        "model.add_transition( state2, state2, 0.4 )\n",
        "model.add_transition( state2, state1, 0.4 )"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Finally the ending states to the model."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 6,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "model.add_transition( state1, model.end, 0.2 )\n",
        "model.add_transition( state2, model.end, 0.2 )"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "To finalize the model, we \"bake\" it."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 7,
    
       "metadata": {},
    
       "outputs": [],
       "source": [
        "model.bake()"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "New we'll create a sample sequence using our model."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 8,
    
       "metadata": {},
    
       "outputs": [
        {
         "name": "stdout",
         "output_type": "stream",
         "text": [
    
          "[-0.42128476]\n"
    
         ]
        }
       ],
       "source": [
        "sequence = model.sample()\n",
    
        "print(sequence)"
    
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Now we'll feed the sequence through a forward algorithm with our model."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 9,
    
       "metadata": {},
    
       "outputs": [
        {
         "name": "stdout",
         "output_type": "stream",
         "text": [
    
          "-3.9368559131089986\n"
    
        "print(model.forward( sequence )[ len(sequence), model.end_index ])"
    
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Next we'll do the same, except with a backwards algorithm."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 10,
    
       "metadata": {},
    
       "outputs": [
        {
         "name": "stdout",
         "output_type": "stream",
         "text": [
    
          "-3.9368559131089986\n"
    
        "print(model.backward( sequence )[0,model.start_index])"
    
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Then we'll feed the sequence again, through a forward-backward algorithm."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 11,
    
       "metadata": {},
    
       "outputs": [
        {
         "name": "stdout",
         "output_type": "stream",
         "text": [
    
          "[[0. 0. 0. 1.]\n",
          " [0. 0. 0. 0.]\n",
          " [1. 0. 0. 0.]\n",
          " [0. 0. 0. 0.]]\n",
          "[[  0. -inf]]\n"
    
         ]
        }
       ],
       "source": [
        "trans, ems = model.forward_backward( sequence )\n",
    
        "print(trans)\n",
        "print(ems)"
    
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Finally we'll train our model with our example sequence."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 12,
    
       "metadata": {},
    
       "outputs": [
        {
         "data": {
          "text/plain": [
    
           "{\n",
           "    \"class\" : \"HiddenMarkovModel\",\n",
           "    \"name\" : \"ExampleModel\",\n",
           "    \"start\" : {\n",
           "        \"class\" : \"State\",\n",
           "        \"distribution\" : null,\n",
           "        \"name\" : \"ExampleModel-start\",\n",
           "        \"weight\" : 1.0\n",
           "    },\n",
           "    \"end\" : {\n",
           "        \"class\" : \"State\",\n",
           "        \"distribution\" : null,\n",
           "        \"name\" : \"ExampleModel-end\",\n",
           "        \"weight\" : 1.0\n",
           "    },\n",
           "    \"states\" : [\n",
           "        {\n",
           "            \"class\" : \"State\",\n",
           "            \"distribution\" : {\n",
           "                \"class\" : \"Distribution\",\n",
           "                \"name\" : \"NormalDistribution\",\n",
           "                \"parameters\" : [\n",
           "                    NaN,\n",
           "                    NaN\n",
           "                ],\n",
           "                \"frozen\" : false\n",
           "            },\n",
           "            \"name\" : \"normal\",\n",
           "            \"weight\" : 1.0\n",
           "        },\n",
           "        {\n",
           "            \"class\" : \"State\",\n",
           "            \"distribution\" : {\n",
           "                \"class\" : \"Distribution\",\n",
           "                \"name\" : \"UniformDistribution\",\n",
           "                \"parameters\" : [\n",
           "                    0.0,\n",
           "                    1.0\n",
           "                ],\n",
           "                \"frozen\" : false\n",
           "            },\n",
           "            \"name\" : \"uniform\",\n",
           "            \"weight\" : 1.0\n",
           "        },\n",
           "        {\n",
           "            \"class\" : \"State\",\n",
           "            \"distribution\" : null,\n",
           "            \"name\" : \"ExampleModel-start\",\n",
           "            \"weight\" : 1.0\n",
           "        },\n",
           "        {\n",
           "            \"class\" : \"State\",\n",
           "            \"distribution\" : null,\n",
           "            \"name\" : \"ExampleModel-end\",\n",
           "            \"weight\" : 1.0\n",
           "        }\n",
           "    ],\n",
           "    \"end_index\" : 3,\n",
           "    \"start_index\" : 2,\n",
           "    \"silent_index\" : 2,\n",
           "    \"edges\" : [\n",
           "        [\n",
           "            2,\n",
           "            1,\n",
           "            0.0,\n",
           "            0.5,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            2,\n",
           "            0,\n",
           "            1.0,\n",
           "            0.5,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            1,\n",
           "            1,\n",
           "            0.4,\n",
           "            0.4,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            1,\n",
           "            0,\n",
           "            0.4,\n",
           "            0.4,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            1,\n",
           "            3,\n",
           "            0.20000000000000004,\n",
           "            0.2,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            0,\n",
           "            0,\n",
           "            0.0,\n",
           "            0.4,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            0,\n",
           "            1,\n",
           "            0.0,\n",
           "            0.4,\n",
           "            null\n",
           "        ],\n",
           "        [\n",
           "            0,\n",
           "            3,\n",
           "            1.0,\n",
           "            0.2,\n",
           "            null\n",
           "        ]\n",
           "    ],\n",
           "    \"distribution ties\" : []\n",
           "}"
    
          ]
         },
         "execution_count": 12,
         "metadata": {},
         "output_type": "execute_result"
        }
       ],
       "source": [
        "model.fit( [ sequence ] )"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Then repeat the algorithms we fed the sequence through before on our improved model."
       ]
      },
      {
       "cell_type": "code",
       "execution_count": 13,
       "metadata": {
        "scrolled": true
       },
       "outputs": [
        {
         "name": "stdout",
         "output_type": "stream",
         "text": [
          "Forward\n",
    
          "nan\n",
    
          "\n",
          "Backward\n",
    
          "nan\n",
    
          "[[nan nan  0. nan]\n",
          " [nan nan  0. nan]\n",
          " [nan nan  0.  0.]\n",
          " [ 0.  0.  0.  0.]]\n",
          "[[nan nan]]\n"
    
        "print(\"Forward\")\n",
        "print(model.forward( sequence )[ len(sequence), model.end_index ])\n",
        "print()\n",
        "print(\"Backward\")\n",
        "print(model.backward( sequence )[ 0,model.start_index ])\n",
        "print()\n",
    
        "trans, ems = model.forward_backward( sequence )\n",
    
        "print(trans)\n",
        "print(ems)"
    
       ]
      }
     ],
     "metadata": {
      "kernelspec": {
    
       "display_name": "Python 3",
    
       "language": "python",
    
       "name": "python3"
    
      },
      "language_info": {
       "codemirror_mode": {
        "name": "ipython",
    
        "version": 3
    
       },
       "file_extension": ".py",
       "mimetype": "text/x-python",
       "name": "python",
       "nbconvert_exporter": "python",
    
       "pygments_lexer": "ipython3",
       "version": "3.7.4"
      },
      "toc": {
       "base_numbering": 1,
       "nav_menu": {},
       "number_sections": true,
       "sideBar": true,
       "skip_h1_title": false,
       "title_cell": "Table of Contents",
       "title_sidebar": "Contents",
       "toc_cell": false,
       "toc_position": {},
       "toc_section_display": true,
       "toc_window_display": false
    
     "nbformat_minor": 1