
machinelearningext

public
2 stars
3 forks
7 issues

Commits

List of commits on branch master.
e73856b1b99b248722d68d19b498a1f773929f33

Update to the latest machinelearning (#65)

ssdpython committed 6 years ago
f6df2adcaa5b5e03d62a8b197410fc84cd3af2d6

Update to the latest machine learning (#64)

ssdpython committed 6 years ago
1b0416ec82d35bc24d71ca8a83d4e68701b76d71

update to the latest machinelearning (#63)

ssdpython committed 6 years ago
e5255faa9cc7b8e2102476656c0fc1a0062e5b50

Update ml.net (#62)

ssdpython committed 6 years ago
35918212281731a3a697eea3d1259766e9cb9e73

Update to the latest machine learning (#61)

ssdpython committed 6 years ago
a7cee38df8b6b928ff27f6555ae5a6c26a24cba7

Update to the latest machine learning (#60)

ssdpython committed 6 years ago

README

Custom Extensions to ML.net

This project proposes some extensions to machinelearning, written in C#. It is a work in progress.

TravisCI Build status CircleCI

Build

On Windows: run build.cmd, or build ml to force rebuilding machinelearning.

On Linux: build.sh.

The documentation can be built with: doxygen conf.dox.

Documentation

Example 1: Inner API

This example relies on the inner API, mostly used inside components of ML.net.

var env = new TlcEnvironment();
var iris = "iris.txt";

// We read the text data and create a dataframe / dataview.
var df = DataFrameIO.ReadCsv(iris, sep: '\t',
                             dtypes: new DataKind?[] { DataKind.R4 });

// We add a transform to concatenate two features into one vector column.
var conc = env.CreateTransform("Concat{col=Feature:Sepal_length,Sepal_width}", df);

// We create training data by mapping roles to columns.
var trainingData = env.CreateExamples(conc, "Feature", label: "Label");

// We create a trainer, here a One Versus Rest with a logistic regression as inner model.
var trainer = env.CreateTrainer("ova{p=lr}");

using (var ch = env.Start("test"))
{
    // We train the model.
    var pred = trainer.Train(env, ch, trainingData);

    // We compute the predictions (here on the training data; in practice, score separate test data).
    var scorer = ScoreUtils.GetScorer(pred, trainingData, env, null);

    // We store the predictions in a file.
    DataFrame.ViewToCsv(scorer, "iris_predictions.txt", host: env);

    // Or we could put the predictions into a dataframe.
    var dfout = DataFrameIO.ReadView(scorer);

    // And access one value...
    var v = dfout.iloc[0, 7];
    Console.WriteLine("PredictedLabel: {0}", v);
}

The current interface of DataFrame is not rich. It will improve in the future.

Example 2: Inner API like Scikit-Learn

This is the same example, written with a ScikitPipeline whose interface resembles scikit-learn.

var env = new TlcEnvironment();
var iris = "iris.txt";

// We read the text data and create a dataframe / dataview.
var df = DataFrameIO.ReadCsv(iris, sep: '\t',
                             dtypes: new DataKind?[] { DataKind.R4 });

var pipe = new ScikitPipeline(new[] { "Concat{col=Feature:Sepal_length,Sepal_width}" }, "ova{p=lr}");
pipe.Train(df, feature: "Feature", label: "Label");

var scorer = pipe.Predict(df);

var dfout = DataFrameIO.ReadView(scorer);

// And access one value...
var v = dfout.iloc[0, 7];
Console.WriteLine("PredictedLabel: {0}", v);

Example 3: DataFrame in C#

The DataFrame class replicates some of the functionality data scientists are used to in other languages such as Python or R. Basic operations on columns are possible:

var text = "AA,BB,CC\n0,1,text\n1,1.1,text2";
var df = DataFrameIO.ReadStr(text);
df["AA+BB"] = df["AA"] + df["BB"];
Console.WriteLine(df.ToString());

AA,BB,CC,AA+BB
0,1,text,1
1,1.1,text2,2.1

Or:

df["AA2"] = df["AA"] + 10;
Console.WriteLine(df.ToString());

AA,BB,CC,AA+BB,AA2
0,1,text,1,10
1,1.1,text2,2.1,11

The next instruction changes a value based on a condition.

df.loc[df["AA"].Filter<DvInt4>(c => (int)c == 1), "CC"] = "changed";
Console.WriteLine(df.ToString());

AA,BB,CC,AA+BB,AA2
0,1,text,1,10
1,1.1,changed,2.1,11
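
To make the semantics of the conditional assignment above concrete, here is a minimal sketch in plain C# that does not use the library at all. It builds a boolean mask from one column and overwrites another column where the mask is true. The names MaskAssignDemo, Mask and SetWhere are hypothetical helpers introduced for illustration; this is not how DataFrame is implemented.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of machinelearningext.
public static class MaskAssignDemo
{
    // Builds a boolean mask: true where the predicate holds for the row's value.
    public static bool[] Mask<T>(T[] column, Func<T, bool> predicate)
        => column.Select(predicate).ToArray();

    // Overwrites target[i] with value wherever mask[i] is true.
    public static void SetWhere<T>(T[] target, bool[] mask, T value)
    {
        if (target.Length != mask.Length)
            throw new ArgumentException("Mask and column must have the same length.");
        for (int i = 0; i < target.Length; i++)
            if (mask[i])
                target[i] = value;
    }

    public static void Main()
    {
        int[] aa = { 0, 1 };
        string[] cc = { "text", "text2" };

        // Same intent as: df.loc[df["AA"].Filter<DvInt4>(c => (int)c == 1), "CC"] = "changed";
        SetWhere(cc, Mask(aa, c => c == 1), "changed");

        Console.WriteLine(string.Join(",", cc)); // prints "text,changed"
    }
}
```

Only the row whose AA value equals 1 is modified, which matches the output shown above.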

A specific set of columns or rows can be extracted:

var view = df[df.ALL, new [] {"AA", "CC"}];
Console.WriteLine(view.ToString());

AA,CC
0,text
1,changed

The dataframe also allows basic filtering:

var view = df[df["AA"] == 0];
Console.WriteLine(view.ToString());

AA,BB,CC,AA+BB,AA2
0,1,text,1,10
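
Row filtering of this kind can be pictured as two steps: find the indices of the rows that satisfy the condition, then pick those rows from every column. The sketch below shows that idea with plain arrays. RowFilterDemo, SelectRows and Pick are hypothetical names used only for illustration, and the code makes no claim about how the library's DataFrame does it internally.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of machinelearningext.
public static class RowFilterDemo
{
    // Returns the indices of the rows whose key value satisfies the predicate.
    public static int[] SelectRows<T>(T[] key, Func<T, bool> predicate)
        => Enumerable.Range(0, key.Length).Where(i => predicate(key[i])).ToArray();

    // Picks the given rows out of a column, preserving their order.
    public static T[] Pick<T>(T[] column, int[] rows)
        => rows.Select(i => column[i]).ToArray();

    public static void Main()
    {
        int[] aa = { 0, 1 };
        string[] cc = { "text", "changed" };

        // Same intent as: df[df["AA"] == 0]
        int[] rows = SelectRows(aa, v => v == 0);

        Console.WriteLine(string.Join(",", Pick(aa, rows))); // prints "0"
        Console.WriteLine(string.Join(",", Pick(cc, rows))); // prints "text"
    }
}
```

Computing the row indices once and reusing them for each column keeps all columns aligned after the filter.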