{"id":171,"date":"2017-02-13T17:58:39","date_gmt":"2017-02-13T17:58:39","guid":{"rendered":"http:\/\/sag.art.uniroma2.it\/kelp_wordpress\/?page_id=171"},"modified":"2017-04-06T15:52:53","modified_gmt":"2017-04-06T15:52:53","slug":"hello-linear-learning","status":"publish","type":"page","link":"http:\/\/www.kelp-ml.org\/?page_id=171","title":{"rendered":"Hello (linear) Learning!"},"content":{"rendered":"<p>Let&#8217;s start with a very simple classification example based on a linear version of the Passive Aggressive algorithm (<a href=\"http:\/\/www.kelp-ml.org\/kelp-javadoc\/current-version\/it\/uniroma2\/sag\/kelp\/learningalgorithm\/classification\/passiveaggressive\/LinearPassiveAggressiveClassification.html\" target=\"_blank\">LinearPassiveAggressiveClassification<\/a>). The full code of this example can be found in the <a title=\"GitHub\" href=\"https:\/\/github.com\" target=\"_blank\">GitHub<\/a> repository <a title=\"kelp-examples repository\" href=\"https:\/\/github.com\/SAG-KeLP\/kelp-full\" target=\"_blank\">kelp-full<\/a>, in the source file <em><a href=\"https:\/\/github.com\/SAG-KeLP\/kelp-full\/blob\/master\/src\/main\/java\/it\/uniroma2\/sag\/kelp\/examples\/main\/HelloLearning.java\" target=\"_blank\">HelloLearning.java<\/a><\/em>.<\/p>\n<p>The dataset used here is the same as the one on the <a title=\"SvmLight\" href=\"http:\/\/svmlight.joachims.org\" target=\"_blank\">svmlight<\/a> page; each example has only been modified to be readable by <strong>KeLP<\/strong>. In particular, each row in <strong>KeLP<\/strong>\u00a0must indicate what kind of vector you are using, <a href=\"http:\/\/www.kelp-ml.org\/kelp-javadoc\/current-version\/it\/uniroma2\/sag\/kelp\/data\/representation\/vector\/SparseVector.html\" target=\"_blank\">Sparse<\/a> or <a href=\"http:\/\/www.kelp-ml.org\/kelp-javadoc\/current-version\/it\/uniroma2\/sag\/kelp\/data\/representation\/vector\/DenseVector.html\" target=\"_blank\">Dense<\/a>. 
The <a title=\"SvmLight\" href=\"http:\/\/svmlight.joachims.org\" target=\"_blank\">svmlight<\/a>\u00a0dataset contains sparse vectors, so if you open the train.klp and test.klp files you will notice that each vector is enclosed in BeginVector (|BV|) and EndVector (|EV|) tags.<\/p>\n<p>The task is to classify each example as belonging to the &#8220;+1&#8221; or the &#8220;-1&#8221; class. The dataset is thus composed of examples of these two classes:<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/SAG-KeLP\/kelp-full\/blob\/master\/src\/main\/resources\/hellolearning\/train.klp\">Training set<\/a> (2000 examples, 1000 of class &#8220;+1&#8221; (positive), and 1000 of class &#8220;-1&#8221; (negative))<\/li>\n<li><a href=\"https:\/\/github.com\/SAG-KeLP\/kelp-full\/blob\/master\/src\/main\/resources\/hellolearning\/test.klp\">Test set<\/a>\u00a0(600 examples, 300 of class &#8220;+1&#8221; (positive), and 300 of class &#8220;-1&#8221; (negative))<\/li>\n<\/ul>\n<p>Let&#8217;s write some Java code.<\/p>\n<p>First of all, we need to <strong>load the dataset<\/strong> into memory and define which class is the positive one for the classification problem.<\/p>\n<pre class=\"lang:java decode:true\" title=\"Loaddataset\">\/\/ Read the training data into the trainingSet variable\r\nSimpleDataset trainingSet = new SimpleDataset();\r\ntrainingSet.populate(\"src\/main\/resources\/hellolearning\/train.klp\");\r\n\/\/ Read the test data into the testSet variable\r\nSimpleDataset testSet = new SimpleDataset();\r\ntestSet.populate(\"src\/main\/resources\/hellolearning\/test.klp\");\r\n\/\/ define the positive class\r\nStringLabel positiveClass = new StringLabel(\"+1\");<\/pre>\n<p>If you want, you can print some <strong>statistics<\/strong> about the dataset using built-in methods.<\/p>\n<pre class=\"lang:java decode:true\" title=\"stat\">\/\/ print some statistics\r\nSystem.out.println(\"Training set statistics\");\r\nSystem.out.print(\"Examples number 
\");\r\nSystem.out.println(trainingSet.getNumberOfExamples());\r\nSystem.out.print(\"Positive examples \");\r\nSystem.out.println(trainingSet.getNumberOfPositiveExamples(positiveClass));\r\nSystem.out.print(\"Negative examples \");\r\nSystem.out.println(trainingSet.getNumberOfNegativeExamples(positiveClass));\r\nSystem.out.println(\"Test set statistics\");\r\nSystem.out.print(\"Examples number \");\r\nSystem.out.println(testSet.getNumberOfExamples());\r\nSystem.out.print(\"Positive examples \");\r\nSystem.out.println(testSet.getNumberOfPositiveExamples(positiveClass));\r\nSystem.out.print(\"Negative examples \");\r\nSystem.out.println(testSet.getNumberOfNegativeExamples(positiveClass));<\/pre>\n<p>Then, instantiate a new <strong>Passive Aggressive algorithm<\/strong> and set some <strong>parameters<\/strong> on it.<\/p>\n<pre class=\"lang:java decode:true\" title=\"pa\">\/\/ instantiate a passive aggressive algorithm\r\nLinearPassiveAggressiveClassification passiveAggressiveAlgorithm = new LinearPassiveAggressiveClassification();\r\n\/\/ use the first (and only here) representation\r\npassiveAggressiveAlgorithm.setRepresentation(\"0\");\r\n\/\/ tell the learner which class is the positive one\r\npassiveAggressiveAlgorithm.setLabel(positiveClass);\r\n\/\/ set the aggressiveness parameter\r\npassiveAggressiveAlgorithm.setAggressiveness(0.01f);<\/pre>\n<p><strong>Learn<\/strong> a model on the training set, obtaining a <strong>Classifier<\/strong>.<\/p>\n<pre class=\"lang:java decode:true\" title=\"Learn\">\/\/ learn and get the prediction function\r\nClassifier f = passiveAggressiveAlgorithm.learn(trainingSet);<\/pre>\n<p>Finally, we <strong>classify<\/strong> each example in the test set and compute the accuracy.<\/p>\n<pre class=\"lang:java decode:true\" title=\"classify\">int correct=0;\r\nfor (Example e : testSet.getExamples()) {\r\n    \/\/ predict the example currently being iterated over\r\n    ClassificationOutput p = f.predict(e);\r\n    if (p.getScore(positiveClass) &gt; 0 &amp;&amp; 
e.isExampleOf(positiveClass))\r\n        correct++;\r\n    else if (p.getScore(positiveClass) &lt; 0 &amp;&amp; !e.isExampleOf(positiveClass))\r\n        correct++;\r\n}\r\nSystem.out.println(\"Accuracy: \" + ((float)correct\/(float)testSet.getNumberOfExamples()));<\/pre>\n<p>When run, the program in the HelloLearning.java file trains the classifier and reports an accuracy of 97.16% on the test set.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Let&#8217;s start with a very simple classification example based on a linear version of the Passive Aggressive algorithm (LinearPassiveAggressiveClassification). The full code of this example can be found in the GitHub repository kelp-full, in the source file HelloLearning.java. The dataset used here is the same as the one on the svmlight page; each example has only been modified <a href=\"http:\/\/www.kelp-ml.org\/?page_id=171\" rel=\"nofollow\"><span class=\"sr-only\">Read more about Hello (linear) Learning!<\/span>[&hellip;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":27,"menu_order":6,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/pages\/171"}],"collection":[{"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=171"}],"version-history":[{"count":12,"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/pages\/171\/revisions"}],"predecessor-version":[{"id":1013,"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/pages\/171\/revisions\/1013"}],"up":[{"embeddable":true,"href":"http:\/\/www.kelp-ml.org\/index.php?rest_route=\/wp\/v2\/pages\/27"}],"wp:attachment":[{"hr
ef":"http:\/\/www.kelp-ml.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=171"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}