Training and Experiment Tracking
Neu.ro Tool Integrations
In our experience, nearly all AI development efforts, be they at large enterprises or new startups, begin by spending the first 3-6 months building their first ML pipelines from available tools. These custom integrations are time-consuming and expensive to produce, can be fragile, and often require drastic changes as project requirements evolve. Such custom ML pipelines frequently support only a small set of built-in algorithms or a single ML library, and are tied to each company’s existing infrastructure. Users cannot easily leverage new ML libraries or share their work with a wider community.
Neuro facilitates adoption of robust, adaptable Machine Learning Operations (MLOps) by simplifying resource orchestration, automation and instrumentation at all steps of ML system construction, including integration, testing, deployment, monitoring and infrastructure management.
To maintain agility and avoid the pitfalls of technical debt, Neuro allows for the seamless connection of an ever-expanding universe of ML tools into your workflow.
We cover the entire ML lifecycle from Data Collection to Testing and Interpretation. All resources, processes and permissions are managed through our Neu.ro platform and can be installed and run on virtually any compute infrastructure, be it on-premise or in the cloud of your choice.
Training & Experiment Tracking
The various components of a machine learning workflow can be split up into independent, reusable, modular parts that can be pipelined together to create, test and deploy models.
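As a rough illustration of this modularity, the sketch below chains independent steps into a single pipeline in plain Python. The step names (load_data, normalize, train_mean_model) and the run_pipeline helper are hypothetical and are not part of the Neu.ro platform; they only show how interchangeable steps can be composed.

```python
from functools import reduce
from typing import Any, Callable, Dict, List, Sequence

Step = Callable[[Any], Any]


def run_pipeline(steps: Sequence[Step], initial: Any = None) -> Any:
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda value, step: step(value), steps, initial)


# Hypothetical steps: each one is independent and can be reused or swapped.
def load_data(_: Any) -> List[float]:
    return [2.0, 4.0, 6.0, 8.0]


def normalize(values: List[float]) -> List[float]:
    top = max(values)
    return [v / top for v in values]


def train_mean_model(values: List[float]) -> Dict[str, float]:
    # Stand-in for real training: the "model" is just the mean of the data.
    return {"mean": sum(values) / len(values)}


model = run_pipeline([load_data, normalize, train_mean_model])
print(model)
```

Because every step shares the same "one value in, one value out" shape, a step such as normalize can be replaced or reused in another pipeline without touching the others.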
Our toolset integrator, Neu.ro Toolbox, contains up-to-date, out-of-the-box integrations with a wide range of open-source and commercial tools required for modern ML/AI development.
For Training and Experiment Tracking, the Neu.ro Platform provides native functionality as well as out-of-the-box integration with W&B (Weights & Biases) and the open source NNI (Neural Network Intelligence) for hyperparameter tuning and experiment tracking.
According to W&B founder Lukas Biewald, while machine learning practitioners are often compared to software developers, “they’re more like scientists in some ways than engineers.” It’s a process that involves numerous experiments. Used together with the Neu.ro platform, these solutions let practitioners track their experiments and offer tools for data set versioning, model evaluation and pipeline management.
Weights & Biases:
W&B provides a leading suite of developer tools for machine learning, including metadata management, model management, and training and experiment tracking.
W&B helps ML development teams track their models, visualize model performance and easily automate model training and iterative improvement.
W&B has been called a system of record for your model results: Add a few lines to your script, and each time you train a new version of your model, you’ll see a new experiment stream live to your dashboard.
Within the Neu.ro Platform, W&B can be integrated where needed to handle training, experiment tracking, model management and metadata management.
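The "few lines" of instrumentation might look like the sketch below. It is a minimal example under stated assumptions: the training loop is a toy with a decaying loss, "demo-project" is a placeholder project name, and offline mode is used so that no W&B account is needed. Only wandb.init, wandb.log and run.finish come from the W&B library; train and train_with_wandb are hypothetical helpers.

```python
import math
from typing import Callable, Dict

Metrics = Dict[str, float]


def train(epochs: int, log: Callable[[Metrics], None]) -> float:
    """Toy training loop: the loss simply decays; real code would fit a model."""
    loss = 1.0
    for epoch in range(epochs):
        loss = math.exp(-0.5 * (epoch + 1))
        log({"epoch": epoch, "loss": loss})  # one record per epoch
    return loss


def train_with_wandb(epochs: int = 5) -> float:
    # Imported lazily so the training loop stays usable without wandb installed.
    import wandb

    # "demo-project" is a placeholder; mode="offline" avoids needing an account.
    run = wandb.init(project="demo-project", config={"epochs": epochs}, mode="offline")
    try:
        return train(epochs, wandb.log)
    finally:
        run.finish()
```

Passing the logger in as a callable keeps the training code independent of the tracker, so the same loop can log to W&B, to a list in a unit test, or to another tool.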
NNI (Neural Network Intelligence):
NNI is a free and open source AutoML toolkit developed by Microsoft. It is used to automate feature engineering, model compression, neural architecture search, and hyperparameter tuning. The source code is licensed under the MIT License and is available on GitHub.
Neu.ro doc:
Hyperparameter tuning with NNI
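For orientation, an NNI trial script typically asks the tuner for a set of hyperparameters, evaluates them, and reports a score back. The sketch below assumes a toy objective in place of real training; DEFAULTS, evaluate and run_trial are hypothetical names, while nni.get_next_parameter and nni.report_final_result are the NNI trial calls.

```python
from typing import Dict

# Hypothetical defaults, used for any value the tuner does not supply.
DEFAULTS: Dict[str, float] = {"lr": 0.01, "momentum": 0.9}


def evaluate(params: Dict[str, float]) -> float:
    """Toy objective standing in for validation accuracy; it peaks at lr=0.1."""
    return 1.0 - abs(params["lr"] - 0.1)


def run_trial() -> None:
    import nni  # lazy import: only needed when NNI launches this script

    params = dict(DEFAULTS)
    params.update(nni.get_next_parameter())  # values sampled by the tuner
    score = evaluate(params)
    nni.report_final_result(score)  # the tuner uses this to choose later trials
```

In a real experiment, the search space and tuner are configured separately, and NNI runs the trial script once per sampled configuration.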
- MLflow (open source)
- TensorBoard
- DVC (open source)
MLflow:
MLflow is an open source platform for managing the ML lifecycle, created by Databricks. It covers experimentation, reproducibility, deployment, and a central model registry.
MLflow Tracking is an API and UI for logging parameters, code versions, metrics and output files when running machine learning code for later visualization.
MLflow Projects provide a standard format for packaging reusable data science code. Each project is a directory with code or a Git repository, and uses a YAML descriptor file to specify its dependencies and how to run the code. Projects can specify their dependencies through a Conda environment.
MLflow Models is a convention for packaging machine learning models in multiple formats called “flavors”. MLflow offers a variety of tools to help you deploy different flavors of models. Each MLflow Model is saved as a directory containing arbitrary files and an MLmodel descriptor file that lists the flavors it can be used in.
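To make the Tracking component more concrete, the sketch below logs parameters and per-epoch metrics for one run. It is an illustration under assumptions: the loss curve is a toy, and training_losses and track_run are hypothetical helpers. The calls mlflow.start_run, mlflow.log_params and mlflow.log_metric are part of the MLflow Tracking API.

```python
from typing import List


def training_losses(learning_rate: float, epochs: int) -> List[float]:
    """Toy loss curve: each epoch shrinks the loss by a factor of (1 - learning_rate)."""
    loss, history = 1.0, []
    for _ in range(epochs):
        loss *= 1.0 - learning_rate
        history.append(loss)
    return history


def track_run(learning_rate: float = 0.5, epochs: int = 3) -> None:
    import mlflow  # lazy import: the toy loss curve above works without MLflow

    with mlflow.start_run():
        # Parameters are logged once; metrics are logged per step.
        mlflow.log_params({"learning_rate": learning_rate, "epochs": epochs})
        for step, loss in enumerate(training_losses(learning_rate, epochs)):
            mlflow.log_metric("loss", loss, step=step)
```

After calling track_run() a few times with different learning rates, the runs can be compared side by side in the MLflow UI.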
TensorBoard:
TensorBoard is TensorFlow’s visualization toolkit. It provides the measurements and visualizations of metrics, such as loss and accuracy, that are needed during the machine learning workflow. It allows for visualizing the model graph (ops and layers), viewing histograms of weights, biases, or other tensors as they change over time, and projecting embeddings to a lower dimensional space.
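As a small sketch of how metrics reach these dashboards, the example below writes a scalar loss value per step using TensorFlow's summary API. The loss values are a toy curve, and loss_curve, write_summaries and the "runs/demo" directory are hypothetical; tf.summary.create_file_writer and tf.summary.scalar are the TensorFlow calls.

```python
from typing import List, Tuple


def loss_curve(steps: int) -> List[Tuple[int, float]]:
    """Toy (step, loss) pairs standing in for values from a real training loop."""
    return [(step, 1.0 / (step + 1)) for step in range(steps)]


def write_summaries(logdir: str = "runs/demo", steps: int = 10) -> None:
    import tensorflow as tf  # lazy import: only needed when writing event files

    writer = tf.summary.create_file_writer(logdir)
    with writer.as_default():
        for step, loss in loss_curve(steps):
            tf.summary.scalar("loss", loss, step=step)  # one point on the loss chart
    writer.flush()
```

Once write_summaries() has run, pointing the tensorboard command at the log directory (for example, tensorboard --logdir runs) should show the loss curve.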
DVC:
DVC is an open source version control system for machine learning projects that tracks ML models and data sets and allows models to be shareable and reproducible.
The complete evolution of every ML model can be tracked with full code and data provenance. This guarantees reproducibility and makes it easy to switch back and forth between experiments.
DVC also allows for easy experiment reproduction by consistently maintaining a combination of input data, configuration, and the code that was initially used to run an experiment.
Finally, DVC fully supports data branching, with instantaneous Git branching even for large files. Users can harness the full power of Git branches to try different ideas instead of relying on sloppy file suffixes and comments in code. Data is not duplicated: one file version can belong to dozens of experiments, and users can create as many experiments as they want, switching instantly from one to another with a saved history of all attempts.
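The idea that one file version can belong to many experiments without being copied can be illustrated with a content-addressed store. The sketch below is a conceptual model only, not DVC's implementation: ContentStore is a hypothetical class, and SHA-256 is an arbitrary choice of hash for the illustration.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Stores each distinct payload once, keyed by the hash of its content."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}                 # hash -> data, stored once
        self.experiments: Dict[str, Dict[str, str]] = {}  # experiment -> {path: hash}

    def add(self, experiment: str, path: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # no-op if this content already exists
        self.experiments.setdefault(experiment, {})[path] = digest
        return digest

    def read(self, experiment: str, path: str) -> bytes:
        return self.blobs[self.experiments[experiment][path]]


store = ContentStore()
store.add("baseline", "data/train.csv", b"a,b\n1,2\n")
store.add("bigger-model", "data/train.csv", b"a,b\n1,2\n")    # same data, new experiment
store.add("more-data", "data/train.csv", b"a,b\n1,2\n3,4\n")  # changed data
print(len(store.blobs))
```

Three experiments reference the file, but only two distinct versions are stored, and switching experiments only changes which hash a path points to.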
Additional toolset integrations in Neu.ro include:
- Label Studio (open source)
- DVC (open source)
- Pachyderm
- VS Code (open source)
- Jupyter (open source)
- Git (open source)
- Neu.ro native
- YAML
- Docker
- Neu.ro native
- NNI (open source)
- W&B
- MLflow (open source)
- W&B
- Neu.ro native
- MLflow (open source)
- W&B
- TensorBoard
- DVC (open source)
- MLflow (open source)
- W&B
- Neu.ro native
- Algorithmia
- Seldon Core (open source)
- Algorithmia
- Prometheus + Grafana (open source)
- Fiddler
- Seldon Alibi (open source)
- WhyLabs