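ML.NET machine learning

Stack Overflow user
Asked 2019-10-05 10:21:07
2 answers · 875 views · 0 followers · 2 votes

I am using Microsoft's ML.NET for machine learning. I want to print the processed input vectors that are used during training. Is there a way to print them?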

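    // Namespaces this snippet relies on (placed at the top of the file):
    // using System; using System.IO; using Microsoft.ML;
    private static string _appPath => Path.GetDirectoryName(Environment.GetCommandLineArgs()[0]);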

    //TRAIN_DATA_FILEPATH: has the path to the dataset used to train the model.
    private static string TRAIN_DATA_FILEPATH => Path.Combine(_appPath, "..", "..", "..", "Data", "A.csv");
    //@"C:\Users\taqwa\Desktop\A.csv"
    private static string MODEL_FILEPATH = @"../../../../MyMLAppML.Model/MLModel.zip";

    // Create MLContext to be shared across the model creation workflow objects 
    // Set a random seed for repeatable/deterministic results across multiple trainings.
    private static MLContext mlContext = new MLContext(seed: 1);

    public static void CreateModel()
    {
        // Load data
        // ModelInput is the input dataset class; it has the string fields Cases, Algorithm and InjuryOrIllness.
        IDataView trainingDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
                                        path: TRAIN_DATA_FILEPATH,
                                        hasHeader: true,     // The first line of the file holds the column names. The default is false.
                                        separatorChar: ',',
                                        allowQuoting: true,  // Quoted values may contain separator characters. When true, consecutive separators denote a missing value and "" denotes an empty value.
                                        allowSparse: false); // Whether the file can contain numerical vectors in sparse format.


        // Build training pipeline
        IEstimator<ITransformer> trainingPipeline = BuildTrainingPipeline(mlContext);

        // Evaluate quality of Model
    //    Evaluate(mlContext, trainingDataView, trainingPipeline);

        // Train Model
        ITransformer mlModel = TrainModel(mlContext, trainingDataView, trainingPipeline);

        // Save model
      //  SaveModel(mlContext, mlModel, MODEL_FILEPATH, trainingDataView.Schema);
    }

    public static IEstimator<ITransformer> BuildTrainingPipeline(MLContext mlContext)
    {
        // Data process configuration with pipeline data transformations
        var dataProcessPipeline = mlContext.Transforms.Conversion.MapValueToKey("Algorithm", "Algorithm")
                                  // MapValueToKey: converts the Algorithm label column into a numeric key type (the format expected by classification trainers).
                                  .Append(mlContext.Transforms.Categorical.OneHotEncoding(new[] { new InputOutputColumnPair("injuryOrIllness", "injuryOrIllness") }))
                                  // OneHotEncoding: converts the categorical injuryOrIllness column into a one-hot encoded vector.
                                  .Append(mlContext.Transforms.Text.FeaturizeText("Cases_tf", "Cases"))
                                  // FeaturizeText: transforms the text column Cases into a numeric vector column named Cases_tf.
                                  .Append(mlContext.Transforms.Concatenate("Features", new[] { "injuryOrIllness", "Cases_tf" }))
                                  .Append(mlContext.Transforms.NormalizeMinMax("Features", "Features"))
                                  // AppendCacheCheckpoint: caches the data view, which can improve performance when the trainer iterates over the data multiple times.
                                  .AppendCacheCheckpoint(mlContext);


        // Set the training algorithm 
        //Here we used the AveragedPerceptron
        var trainer = mlContext.MulticlassClassification.Trainers.OneVersusAll(mlContext.BinaryClassification.Trainers.AveragedPerceptron(labelColumnName: "Algorithm", numberOfIterations: 10, featureColumnName: "Features"), labelColumnName: "Algorithm")
                                  .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));
        //var trainer = mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy(labelColumnName: "Algorithm", featureColumnName: "Features")
        //              .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));

        //OneVersusAllTrainer: which predicts a multiclass target using one-versus-all strategy with the binary classification estimator specified by binaryEstimator.
        var trainingPipeline = dataProcessPipeline.Append(trainer);


        return trainingPipeline;

    }


    public static ITransformer TrainModel(MLContext mlContext, IDataView trainingDataView, IEstimator<ITransformer> trainingPipeline)
    {
        Console.WriteLine("=============== Training  model ===============");

        // Fit() trains the model: it applies the data transformations, runs the trainer, and returns the trained model.
        ITransformer model = trainingPipeline.Fit(trainingDataView);
        Console.WriteLine($"{trainingDataView.Schema}");



        Console.WriteLine("=============== End of training process ===============");
        return model;
    }

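This is part of my code. I am trying to print the processed (featurized) input vectors that are used during training.

So I tried printing trainingDataView.Schema with Console.WriteLine($"{trainingDataView.Schema}");, but the output only shows (Non-Public members).

2 Answers

Stack Overflow user

Answered 2019-11-28 08:41:28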

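Have you tried the Preview() method? Preview can be used on an IEstimator as well as on an ITransformer. You can also use GetColumn<> to get the values of a specific column from an IDataView. In addition, see this documentation page: https://docs.microsoft.com/cs-cz/dotnet/machine-learning/how-to-guides/inspect-intermediate-data-ml-net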
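
As a rough illustration of the Preview() and GetColumn<> suggestion, the sketch below fits only the featurization steps, transforms the data, and prints the resulting schema and the Features vectors. It assumes the Microsoft.ML NuGet package is referenced. The SampleInput class, the PreviewExample class, the GetFeatureVectors helper and the three sample rows are all hypothetical stand-ins, not the asker's actual ModelInput or dataset.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical stand-in for the asker's ModelInput class.
public class SampleInput
{
    public string Cases { get; set; }
    public string InjuryOrIllness { get; set; }
}

public static class PreviewExample
{
    // Fits the featurization steps on made-up rows and returns the Features vectors.
    public static float[][] GetFeatureVectors()
    {
        var mlContext = new MLContext(seed: 1);

        // Made-up rows, only so that the sketch is self-contained.
        var rows = new[]
        {
            new SampleInput { Cases = "fell from ladder", InjuryOrIllness = "Injury" },
            new SampleInput { Cases = "persistent cough", InjuryOrIllness = "Illness" },
            new SampleInput { Cases = "cut on left hand", InjuryOrIllness = "Injury" },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Featurization steps similar to the question's pipeline, without the trainer.
        var pipeline = mlContext.Transforms.Categorical.OneHotEncoding("InjuryOrIllness")
            .Append(mlContext.Transforms.Text.FeaturizeText("Cases_tf", "Cases"))
            .Append(mlContext.Transforms.Concatenate("Features", "InjuryOrIllness", "Cases_tf"))
            .Append(mlContext.Transforms.NormalizeMinMax("Features"));

        // Fit the transforms, then apply them to obtain the processed data.
        IDataView transformed = pipeline.Fit(data).Transform(data);

        // Preview materializes the first rows; its Schema lists every column and type.
        var preview = transformed.Preview(maxRows: 3);
        foreach (var column in preview.Schema)
            Console.WriteLine($"{column.Name}: {column.Type}");

        // GetColumn reads a single column; a float vector column can be read as float[].
        return transformed.GetColumn<float[]>("Features").ToArray();
    }

    public static void Main()
    {
        foreach (float[] features in GetFeatureVectors())
            Console.WriteLine(string.Join(", ", features.Take(10)));
    }
}
```

The key point is that the Features column only exists on the IDataView returned by Transform, not on the original trainingDataView, which is why printing trainingDataView.Schema does not show it.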

Votes: 1

Stack Overflow user

Answered 2020-03-21 23:40:46

You can inspect the schema of the data or iterate over each row.

In the first case, you can use:

var schema = data.Preview();

Otherwise, you can iterate over the rows like this:

    IEnumerable<ModelInput> inputData = mlContext.Data.CreateEnumerable<ModelInput>(data, reuseRowObject: true);

    foreach (ModelInput row in inputData)
    {
        foreach (var prop in row.GetType().GetProperties())
        {
            Console.WriteLine("{0}={1}", prop.Name, prop.GetValue(row, null));
        }
    }
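
The loop above prints the properties of ModelInput, so it can only show columns that ModelInput declares. One possible way to read the featurized values row by row is to enumerate the transformed data into a class that declares a float[] Features property. This is a minimal sketch under that assumption: TextRow, FeaturizedRow, EnumerateFeaturesExample, ReadFeaturizedRows and the sample rows are hypothetical, and a single FeaturizeText step stands in for the full pipeline.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

// Hypothetical input row; stands in for the asker's ModelInput.
public class TextRow
{
    public string Cases { get; set; }
}

// Hypothetical output row: declares only the columns we want to read back.
public class FeaturizedRow
{
    public string Cases { get; set; }
    public float[] Features { get; set; }
}

public static class EnumerateFeaturesExample
{
    // Featurizes made-up rows and materializes them as typed objects.
    public static FeaturizedRow[] ReadFeaturizedRows()
    {
        var mlContext = new MLContext(seed: 1);
        IDataView data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new TextRow { Cases = "fell from ladder" },
            new TextRow { Cases = "persistent cough" },
        });

        // A single featurization step writing directly to a Features column.
        var featurizer = mlContext.Transforms.Text.FeaturizeText("Features", "Cases");
        IDataView transformed = featurizer.Fit(data).Transform(data);

        // reuseRowObject: false gives each row its own object, so the rows can be stored safely.
        return mlContext.Data
            .CreateEnumerable<FeaturizedRow>(transformed, reuseRowObject: false)
            .ToArray();
    }

    public static void Main()
    {
        foreach (FeaturizedRow row in ReadFeaturizedRows())
            Console.WriteLine($"{row.Cases} -> {row.Features.Length} values: [{string.Join(", ", row.Features)}]");
    }
}
```

With reuseRowObject: true, as in the answer, the same object is overwritten on every iteration, which is fine for printing inside the loop but not for collecting the rows into a list or array.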

Votes: 0

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/58247400
