Project Description
This library is designed to assist in the use of common machine learning algorithms on the .NET platform. It aims to include the most popular supervised and unsupervised learning algorithms while minimizing the friction involved in creating predictive models.
Supervised Learning
Supervised learning is an approach in machine learning where the system is provided with labeled examples of a problem and builds a model to predict the labels of future, unlabeled examples. Supervised learning problems are further divided into the following sets:
- Binary Classification – Predicting a Yes/No type value
- Multi-Class Classification – Predicting a value from a finite set (e.g. {A, B, C, D} or {1, 2, 3, 4})
- Regression – Predicting a continuous value (i.e. a number)
Currently the library contains:
- The Perceptron Algorithm (Only supports binary classification for now)
Using this algorithm involves marking up a class with features (used to predict) and the target label (what to predict).
    public enum Grade
    {
        A,
        B,
        C,
        D,
        F
    }

    public class Student
    {
        // Properties marked [Feature] are used to predict.
        [Feature]
        public string Name { get; set; }

        [Feature]
        public Grade Grade { get; set; }

        [Feature]
        public double GPA { get; set; }

        [Feature]
        public int Age { get; set; }

        [Feature]
        public bool Tall { get; set; }

        [Feature]
        public int Friends { get; set; }

        // The property marked [Label] is what to predict.
        [Label]
        public bool Good { get; set; }
    }
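For readers wondering how markup like this can be consumed, here is a minimal sketch of attribute discovery through reflection. The FeatureAttribute and LabelAttribute classes, the Sample class, and the MarkupReader helper are all hypothetical stand-ins written for illustration; the library's own definitions may differ.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for the library's attributes (assumption).
[AttributeUsage(AttributeTargets.Property)]
public sealed class FeatureAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Property)]
public sealed class LabelAttribute : Attribute { }

// A small illustrative data class, marked up like Student.
public class Sample
{
    [Feature] public double GPA { get; set; }
    [Feature] public int Age { get; set; }
    [Label] public bool Good { get; set; }
}

public static class MarkupReader
{
    // Names of all public properties tagged with [Feature], sorted by name.
    public static string[] FeatureNames(Type type) =>
        type.GetProperties()
            .Where(p => p.IsDefined(typeof(FeatureAttribute), false))
            .Select(p => p.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();

    // Name of the first property tagged with [Label], or "" if there is none.
    public static string LabelName(Type type) =>
        type.GetProperties()
            .Where(p => p.IsDefined(typeof(LabelAttribute), false))
            .Select(p => p.Name)
            .FirstOrDefault() ?? "";
}
```

Calling MarkupReader.FeatureNames(typeof(Sample)) lists the properties a model would learn from, and MarkupReader.LabelName(typeof(Sample)) names the property it would try to predict.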
Once a class is marked up in this fashion, a classifier is generated from a list of objects instantiated from the data class. A new object can then be the target of prediction based on the generated model:
    var model = new PerceptronModel<Student>();
    var classifier = model.Generate(ListOfStudents);

    classifier.Predict(NewStudentObject);
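To give a feel for what a perceptron does once the features have been turned into numbers, here is a minimal sketch of the classic perceptron learning rule. It is not the library's implementation: the PerceptronSketch class, its Train and Predict methods, the +1/-1 label encoding, and the epochs and rate parameters are all assumptions made for this illustration.

```csharp
using System;

public static class PerceptronSketch
{
    // Learns weights for rows of numeric features with labels encoded as
    // +1 or -1. The last entry of the returned array is the bias term.
    public static double[] Train(double[][] x, int[] y, int epochs = 100, double rate = 1.0)
    {
        int n = x[0].Length;
        var w = new double[n + 1];

        for (int e = 0; e < epochs; e++)
        {
            int mistakes = 0;
            for (int i = 0; i < x.Length; i++)
            {
                if (Predict(w, x[i]) == y[i]) continue;

                // Misclassified: nudge the weights toward the correct side.
                for (int j = 0; j < n; j++) w[j] += rate * y[i] * x[i][j];
                w[n] += rate * y[i];
                mistakes++;
            }
            if (mistakes == 0) break; // every example is classified correctly
        }
        return w;
    }

    // Returns +1 when the weighted sum plus bias is positive, otherwise -1.
    public static int Predict(double[] w, double[] x)
    {
        double sum = w[w.Length - 1];
        for (int j = 0; j < x.Length; j++) sum += w[j] * x[j];
        return sum > 0 ? 1 : -1;
    }
}
```

The rule only settles on a perfect separator when the two classes are linearly separable; otherwise the loop simply stops after the given number of epochs.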
- Decision Tree Classifier (Supports Binary and Multi-Class Classification)
The same labeling (attribute) technique is used to mark up a class definition.
Using the Decision Tree algorithm is similar to the Perceptron usage above:
    var model = new DecisionTreeModel<Student>(3, ImpurityType.Entropy);
    var predictor = model.Generate(ListOfStudents);

    predictor.Predict(NewStudent);
This model has two additional parameters: the tree height (set to 3 in this instance) and the impurity measure used to compute information gain (this example uses Entropy; Gini and Max Error are also included).
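The three impurity measures can be sketched from their textbook definitions over class counts. This is an illustration only: the Impurity class and its method names are hypothetical and are not taken from the library, and Max Error is assumed here to mean the standard classification error (one minus the largest class proportion).

```csharp
using System;
using System.Linq;

public static class Impurity
{
    // Converts class counts into proportions; assumes a non-zero total.
    private static double[] Proportions(int[] counts)
    {
        double total = counts.Sum();
        return counts.Select(c => c / total).ToArray();
    }

    // Entropy: -sum(p * log2(p)), skipping empty classes.
    public static double Entropy(int[] counts) =>
        -Proportions(counts).Where(p => p > 0).Sum(p => p * Math.Log(p, 2));

    // Gini index: 1 - sum(p^2).
    public static double Gini(int[] counts) =>
        1.0 - Proportions(counts).Sum(p => p * p);

    // Classification error: 1 - max(p).
    public static double MaxError(int[] counts) =>
        1.0 - Proportions(counts).Max();

    // Information gain of a split: parent impurity minus the
    // size-weighted impurity of the (non-empty) children.
    public static double Gain(Func<int[], double> measure, int[] parent, params int[][] children)
    {
        double total = parent.Sum();
        return measure(parent) - children.Sum(c => c.Sum() / total * measure(c));
    }
}
```

For instance, Impurity.Gain(Impurity.Entropy, new[] { 5, 5 }, new[] { 5, 0 }, new[] { 0, 5 }) scores a split that separates two evenly mixed classes perfectly, which is the largest gain available for that parent node.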
Unsupervised Learning
Unsupervised learning is an approach which involves learning about the shape of unlabeled data. This library currently contains:
- KMeans – Performs automatic grouping of data into K groups (specified a priori)
Marking up the data class works the same way as for the supervised learning algorithms, except that these algorithms ignore the [Label] attribute:
    var kmeans = new KMeans<Student>();
    var grouping = kmeans.Generate(ListOfStudents, 2);
Here the KMeans algorithm partitions ListOfStudents into two groups and returns an array giving the group assigned to each student (in this case group 0 or group 1).
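The idea of returning one group index per item can be sketched with the standard k-means (Lloyd's) iteration over numeric rows. This is a sketch under stated assumptions, not the library's code: the KMeansSketch class, the Cluster method, the maxIterations parameter, and the choice to seed the centroids with the first k rows are all made up for this illustration.

```csharp
using System;
using System.Linq;

public static class KMeansSketch
{
    // Returns, for each row of data, the index (0..k-1) of its group.
    public static int[] Cluster(double[][] data, int k, int maxIterations = 100)
    {
        int dims = data[0].Length;
        // Simplistic seeding (assumption): the first k rows become centroids.
        double[][] centroids = data.Take(k).Select(r => (double[])r.Clone()).ToArray();
        var groups = new int[data.Length];

        for (int iter = 0; iter < maxIterations; iter++)
        {
            bool changed = false;

            // Assignment step: attach each row to its nearest centroid.
            for (int i = 0; i < data.Length; i++)
            {
                int best = 0;
                for (int c = 1; c < k; c++)
                    if (SquaredDistance(data[i], centroids[c]) < SquaredDistance(data[i], centroids[best]))
                        best = c;
                if (groups[i] != best) { groups[i] = best; changed = true; }
            }
            if (!changed && iter > 0) break; // assignments are stable

            // Update step: move each centroid to the mean of its members.
            for (int c = 0; c < k; c++)
            {
                var members = data.Where((row, i) => groups[i] == c).ToArray();
                if (members.Length == 0) continue; // leave an empty group's centroid alone
                for (int d = 0; d < dims; d++)
                    centroids[c][d] = members.Average(m => m[d]);
            }
        }
        return groups;
    }

    private static double SquaredDistance(double[] a, double[] b) =>
        a.Zip(b, (x, y) => (x - y) * (x - y)).Sum();
}
```

The returned array lines up with the input rows, so groups[i] is the group of data[i]. Which group ends up numbered 0 or 1 depends on the seeding, so the numbers identify groups but carry no meaning of their own.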
- Hierarchical Clustering – In progress!
Planning
Currently planning/hoping to do the following:
- Boosting/Bagging
- Hierarchical Clustering
- Naïve Bayes Classifier
- Collaborative filtering algorithms (suggest a product, friend etc.)
- Latent Semantic Analysis (for better searching of text etc.)
- Support Vector Machines (more powerful classifier)
- Principal Component Analysis – Aids in dimensionality reduction which should allow/facilitate learning from images
- *Maybe* – Common AI algorithms such as A*, Beam Search, Minimax etc.
Contact Me
You can email me at sethj AT devexpress.com or follow me on Twitter: @sethjuarez.