Machine Learning in Go

There are limited options for doing machine learning in the Go ecosystem. I’ve found a curated list here that serves as a good starting point, and it covers most of the main packages I’ve found through search. When looking for Go packages in general, the best places to start are here and here.

Currently, there are only two general-purpose libraries (golearn and mlgo) and several algorithm-specific ones. Of the two, golearn is in a much better state than mlgo: it has more algorithms, is easier to use, and has more active contributors. I admit some bias, since I have a pull request adding an average perceptron classifier to golearn waiting right now. Decide for yourself by following the links.

Go Learn – Machine Learning for Go
Mlgo – General Machine Learning for Go

Golearn currently has the following algorithms implemented (as of this writing, of course), along with support for CSV and ARFF files and for cross validation. The repo also includes several datasets useful for testing out algorithms, located here.

SVM
Linear Regression
KNN Classification
KNN Regression
Neural Networks
Naive Bayes
Decision Trees (Random and ID3)
RandomForest
Multiple options for pairwise distance functions

The best place to get started with golearn is the actual README, as it walks you through the install process and includes a test script to get you going. The next place to look is the examples folder, where several runnable examples show how to load a CSV, split it into train/test sets, then train a model and pull out the results.

Here is an example pulled from the examples folder that uses the KNNClassifier to predict labels after doing a train/test split on some flower data (the classic iris dataset).

package main

import (
	"fmt"
	"github.com/sjwhitworth/golearn/base"
	"github.com/sjwhitworth/golearn/evaluation"
	"github.com/sjwhitworth/golearn/knn"
)

func main() {
	rawData, err := base.ParseCSVToInstances("../datasets/iris_headers.csv", true)
	if err != nil {
		panic(err)
	}

	// Initialises a new KNN classifier using Euclidean distance and k = 2
	cls := knn.NewKnnClassifier("euclidean", 2)

	// Do a 50/50 training-test split
	trainData, testData := base.InstancesTrainTestSplit(rawData, 0.50)
	cls.Fit(trainData)

	// Calculates the Euclidean distance to the training instances and returns the most popular label
	predictions := cls.Predict(testData)
	fmt.Println(predictions)

	// Prints precision/recall metrics
	confusionMat, err := evaluation.GetConfusionMatrix(testData, predictions)
	if err != nil {
		panic(fmt.Sprintf("Unable to get confusion matrix: %s", err.Error()))
	}
	fmt.Println(evaluation.GetSummary(confusionMat))
}

Golearn makes it easy to crunch data using Go, but there are some challenges when it comes to putting it into a data-stream analysis pipeline. Part of the issue stems from how the data model is constructed, and from the assumption that classifiers are not long-lived processes that load up pre-trained models. For a talk, I tried to make a simple server out of the above example that would expose the classifier over a REST API, and I ended up quickly writing my own KNNClassifier so that I could manage state more easily; a rough sketch of that shape of server follows. All in all, golearn is turning into a fairly decent package for simple ML tasks, and I think it has a bright future ahead of it.
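
For what it's worth, here is a minimal sketch of what serving a classifier over HTTP can look like. Everything in it is hypothetical rather than golearn API: the Classifier interface, the dummyModel stand-in, and the /predict route are all assumptions made to keep the sketch self-contained. The point is simply that the trained model is loaded once and stays in memory for the life of the server.

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// Classifier is a hypothetical stand-in for a trained model; golearn's
// own classifiers predict over a whole set of instances rather than one row.
type Classifier interface {
	PredictOne(features []float64) string
}

// dummyModel is a toy "model" that labels an iris by petal length
// alone, just to make the sketch runnable end to end.
type dummyModel struct{}

func (dummyModel) PredictOne(features []float64) string {
	if len(features) > 2 && features[2] < 2.5 {
		return "Iris-setosa"
	}
	return "Iris-versicolor"
}

func main() {
	// Train (or load) the model once; it stays resident across requests.
	var cls Classifier = dummyModel{}

	http.HandleFunc("/predict", func(w http.ResponseWriter, r *http.Request) {
		var features []float64
		if err := json.NewDecoder(r.Body).Decode(&features); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		fmt.Fprintln(w, cls.PredictOne(features))
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}

With this toy model, curl -d '[5.1,3.5,1.4,0.2]' localhost:8080/predict comes back with Iris-setosa.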

2 Responses

  1. Gibran says:

    Your Go packages are great. Is there some way to save the created model for use later?

  2. Ross says:

    You can use the gob package to serialize the trained model and write it out to disk.
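
    Something like this is all it takes. This is a minimal sketch: the Model struct is a hypothetical stand-in for whatever trained state your classifier carries, and note that gob only serializes exported fields.

    package main

    import (
    	"encoding/gob"
    	"fmt"
    	"os"
    )

    // Model is a hypothetical stand-in for a trained classifier's state.
    type Model struct {
    	Labels  []string
    	Weights []float64
    }

    func main() {
    	// Serialize the trained model and write it out to disk.
    	trained := Model{Labels: []string{"Iris-setosa"}, Weights: []float64{0.1, 0.9}}
    	out, err := os.Create("model.gob")
    	if err != nil {
    		panic(err)
    	}
    	if err := gob.NewEncoder(out).Encode(trained); err != nil {
    		panic(err)
    	}
    	out.Close()

    	// Later (for example, in another process), read it back in.
    	in, err := os.Open("model.gob")
    	if err != nil {
    		panic(err)
    	}
    	defer in.Close()
    	var loaded Model
    	if err := gob.NewDecoder(in).Decode(&loaded); err != nil {
    		panic(err)
    	}
    	fmt.Println(loaded.Labels)
    }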
