hmm_example.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hidden Markov Model Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "authors:<br>\n",
    "Jacob Schreiber [<a href=\"mailto:jmschreiber91@gmail.com\">jmschreiber91@gmail.com</a>]<br>\n",
    "Nicholas Farn [<a href=\"mailto:nicholasfarn@gmail.com\">nicholasfarn@gmail.com</a>]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A simple example highlighting how to build a model using states, add\n",
    "transitions, and then run the algorithms, including showing how training\n",
    "on a sequence improves the probability of the sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "from pomegranate import *\n",
    "\n",
    "random.seed(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we will create the states of the model, one uniform and one normal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "state1 = State( UniformDistribution(0.0, 1.0), name=\"uniform\" )\n",
    "state2 = State( NormalDistribution(0, 2), name=\"normal\" )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will then create the model by creating a HiddenMarkovModel instance. Then we will add the states."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = HiddenMarkovModel( name=\"ExampleModel\" )\n",
    "model.add_state( state1 )\n",
    "model.add_state( state2 )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we'll add the start states to the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.add_transition( model.start, state1, 0.5 )\n",
    "model.add_transition( model.start, state2, 0.5 )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And the transition matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.add_transition( state1, state1, 0.4 )\n",
    "model.add_transition( state1, state2, 0.4 )\n",
    "model.add_transition( state2, state2, 0.4 )\n",
    "model.add_transition( state2, state1, 0.4 )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally the ending states to the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.add_transition( state1, model.end, 0.2 )\n",
    "model.add_transition( state2, model.end, 0.2 )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To finalize the model, we \"bake\" it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.bake()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "New we'll create a sample sequence using our model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.42128476]\n"
     ]
    }
   ],
   "source": [
    "sequence = model.sample()\n",
    "print(sequence)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we'll feed the sequence through a forward algorithm with our model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-3.9368559131089986\n"
     ]
    }
   ],
   "source": [
    "print(model.forward( sequence )[ len(sequence), model.end_index ])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we'll do the same, except with a backwards algorithm."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-3.9368559131089986\n"
     ]
    }
   ],
   "source": [
    "print(model.backward( sequence )[0,model.start_index])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we'll feed the sequence again, through a forward-backward algorithm."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0. 0. 0. 1.]\n",
      " [0. 0. 0. 0.]\n",
      " [1. 0. 0. 0.]\n",
      " [0. 0. 0. 0.]]\n",
      "[[  0. -inf]]\n"
     ]
    }
   ],
   "source": [
    "trans, ems = model.forward_backward( sequence )\n",
    "print(trans)\n",
    "print(ems)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally we'll train our model with our example sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{\n",
       "    \"class\" : \"HiddenMarkovModel\",\n",
       "    \"name\" : \"ExampleModel\",\n",
       "    \"start\" : {\n",
       "        \"class\" : \"State\",\n",
       "        \"distribution\" : null,\n",
       "        \"name\" : \"ExampleModel-start\",\n",
       "        \"weight\" : 1.0\n",
       "    },\n",
       "    \"end\" : {\n",
       "        \"class\" : \"State\",\n",
       "        \"distribution\" : null,\n",
       "        \"name\" : \"ExampleModel-end\",\n",
       "        \"weight\" : 1.0\n",
       "    },\n",
       "    \"states\" : [\n",
       "        {\n",
       "            \"class\" : \"State\",\n",
       "            \"distribution\" : {\n",
       "                \"class\" : \"Distribution\",\n",
       "                \"name\" : \"NormalDistribution\",\n",
       "                \"parameters\" : [\n",
       "                    NaN,\n",
       "                    NaN\n",
       "                ],\n",
       "                \"frozen\" : false\n",
       "            },\n",
       "            \"name\" : \"normal\",\n",
       "            \"weight\" : 1.0\n",
       "        },\n",
       "        {\n",
       "            \"class\" : \"State\",\n",
       "            \"distribution\" : {\n",
       "                \"class\" : \"Distribution\",\n",
       "                \"name\" : \"UniformDistribution\",\n",
       "                \"parameters\" : [\n",
       "                    0.0,\n",
       "                    1.0\n",
       "                ],\n",
       "                \"frozen\" : false\n",
       "            },\n",
       "            \"name\" : \"uniform\",\n",
       "            \"weight\" : 1.0\n",
       "        },\n",
       "        {\n",
       "            \"class\" : \"State\",\n",
       "            \"distribution\" : null,\n",
       "            \"name\" : \"ExampleModel-start\",\n",
       "            \"weight\" : 1.0\n",
       "        },\n",
       "        {\n",
       "            \"class\" : \"State\",\n",
       "            \"distribution\" : null,\n",
       "            \"name\" : \"ExampleModel-end\",\n",
       "            \"weight\" : 1.0\n",
       "        }\n",
       "    ],\n",
       "    \"end_index\" : 3,\n",
       "    \"start_index\" : 2,\n",
       "    \"silent_index\" : 2,\n",
       "    \"edges\" : [\n",
       "        [\n",
       "            2,\n",
       "            1,\n",
       "            0.0,\n",
       "            0.5,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            2,\n",
       "            0,\n",
       "            1.0,\n",
       "            0.5,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            1,\n",
       "            1,\n",
       "            0.4,\n",
       "            0.4,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            1,\n",
       "            0,\n",
       "            0.4,\n",
       "            0.4,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            1,\n",
       "            3,\n",
       "            0.20000000000000004,\n",
       "            0.2,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            0,\n",
       "            0,\n",
       "            0.0,\n",
       "            0.4,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            0,\n",
       "            1,\n",
       "            0.0,\n",
       "            0.4,\n",
       "            null\n",
       "        ],\n",
       "        [\n",
       "            0,\n",
       "            3,\n",
       "            1.0,\n",
       "            0.2,\n",
       "            null\n",
       "        ]\n",
       "    ],\n",
       "    \"distribution ties\" : []\n",
       "}"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.fit( [ sequence ] )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then repeat the algorithms we fed the sequence through before on our improved model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Forward\n",
      "nan\n",
      "\n",
      "Backward\n",
      "nan\n",
      "\n",
      "[[nan nan  0. nan]\n",
      " [nan nan  0. nan]\n",
      " [nan nan  0.  0.]\n",
      " [ 0.  0.  0.  0.]]\n",
      "[[nan nan]]\n"
     ]
    }
   ],
   "source": [
    "print(\"Forward\")\n",
    "print(model.forward( sequence )[ len(sequence), model.end_index ])\n",
    "print()\n",
    "print(\"Backward\")\n",
    "print(model.backward( sequence )[ 0,model.start_index ])\n",
    "print()\n",
    "trans, ems = model.forward_backward( sequence )\n",
    "print(trans)\n",
    "print(ems)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}