Chunk a Dataframe to Parquet
I have a large pandas dataframe in memory. How can I chunk it into Parquet files using Metaflow?
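One way to do this, sketched below, is to slice the dataframe by row ranges inside a step and write each slice with `DataFrame.to_parquet`; the example frame, chunk size, and output file names are illustrative assumptions, and the written paths are kept as a flow artifact so downstream steps know what was produced.

```python
from metaflow import FlowSpec, step

class ChunkToParquetFlow(FlowSpec):

    @step
    def start(self):
        import pandas as pd

        # Illustrative in-memory frame; in practice this is your large dataframe.
        df = pd.DataFrame({"x": range(100_000), "y": range(100_000)})

        chunk_size = 10_000  # assumed number of rows per chunk
        self.paths = []
        for i, offset in enumerate(range(0, len(df), chunk_size)):
            path = f"chunk_{i}.parquet"
            # to_parquet requires pyarrow or fastparquet to be installed
            df.iloc[offset:offset + chunk_size].to_parquet(path, index=False)
            self.paths.append(path)
        self.next(self.end)

    @step
    def end(self):
        print(f"wrote {len(self.paths)} Parquet files")

if __name__ == "__main__":
    ChunkToParquetFlow()
```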
I have a CSV and want to access it in a Metaflow flow. How can I read this data into tasks and write it to disk?
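A minimal sketch using Metaflow's `IncludeFile`, which reads the file when the run starts and versions its contents with the flow; the `./input.csv` path and the output file name are hypothetical.

```python
from io import StringIO
from metaflow import FlowSpec, IncludeFile, step

class CSVFlow(FlowSpec):

    # The file is read at run time and stored with the flow as an artifact.
    data = IncludeFile("data", default="./input.csv", help="Path to a local CSV")

    @step
    def start(self):
        import pandas as pd

        # self.data holds the file contents as a string.
        df = pd.read_csv(StringIO(self.data))
        # Write a copy to the task's local disk.
        df.to_csv("copy_on_task_disk.csv", index=False)
        self.num_rows = len(df)
        self.next(self.end)

    @step
    def end(self):
        print(f"{self.num_rows} rows read from the included CSV")

if __name__ == "__main__":
    CSVFlow()
```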
How do I load data from a local directory structure on AWS Batch using Metaflow's IncludeFile?
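`IncludeFile` carries a single file, so one workaround, sketched below with hypothetical paths, is to archive the directory up front (for example `tar czf data.tar.gz my_data_dir/`) and unpack it inside the Batch task, where the archive bytes travel with the flow.

```python
import io
import tarfile
from metaflow import FlowSpec, IncludeFile, batch, step

class LocalDirFlow(FlowSpec):

    # Archive created beforehand, e.g.: tar czf data.tar.gz my_data_dir/
    archive = IncludeFile("archive", default="data.tar.gz", is_text=False)

    @batch(cpu=1, memory=4000)
    @step
    def start(self):
        # self.archive holds the raw bytes, available inside the Batch container.
        with tarfile.open(fileobj=io.BytesIO(self.archive)) as tar:
            tar.extractall(path=".")
        self.next(self.end)

    @step
    def end(self):
        print("directory unpacked on the Batch task")

if __name__ == "__main__":
    LocalDirFlow()
```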
I have a Parquet dataset stored in AWS S3 and want to access it in a Metaflow flow. How can I read one or several Parquet files at once from a flow and load them into an Arrow table?
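A sketch of one approach: download the objects with Metaflow's S3 client and read them with pyarrow inside the `with` block (the temporary files are cleaned up when it exits). The bucket prefix is a placeholder.

```python
from metaflow import FlowSpec, S3, step

class ParquetToArrowFlow(FlowSpec):

    @step
    def start(self):
        import pyarrow as pa
        import pyarrow.parquet as pq

        # Placeholder prefix; point s3root at your own dataset.
        with S3(s3root="s3://my-bucket/my-dataset/") as s3:
            objs = s3.get_all()  # download every object under the prefix
            tables = [pq.read_table(obj.path) for obj in objs
                      if obj.key.endswith(".parquet")]
        self.table = pa.concat_tables(tables)
        print(self.table.num_rows, "rows")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    ParquetToArrowFlow()
```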
I have a Parquet dataset stored in AWS S3 and want to access it in a Metaflow flow. How can I read one or several Parquet files at once from a flow and load them into a pandas dataframe?
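The same pattern works for pandas: read each downloaded file with `pd.read_parquet` and concatenate the results. Again, the bucket prefix below is a placeholder.

```python
from metaflow import FlowSpec, S3, step

class ParquetToPandasFlow(FlowSpec):

    @step
    def start(self):
        import pandas as pd

        # Placeholder prefix; swap in your own bucket and key layout.
        with S3(s3root="s3://my-bucket/my-dataset/") as s3:
            frames = [pd.read_parquet(obj.path) for obj in s3.get_all()
                      if obj.key.endswith(".parquet")]
        self.df = pd.concat(frames, ignore_index=True)
        print(len(self.df), "rows")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    ParquetToPandasFlow()
```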
Python is one of the main programming languages of data science and machine learning practitioners. This makes it important for programmers and scientists to understand the most common data structures used in Python’s machine learning ecosystem.
How can I access data in S3 with a SQL query from my Metaflow flow?
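One option, sketched below, is to download the objects with Metaflow's S3 client, load them into a pandas dataframe, and run SQL over it with DuckDB, which can query an in-memory dataframe by its variable name; the bucket prefix and the query itself are illustrative.

```python
from metaflow import FlowSpec, S3, step

class SQLOnS3Flow(FlowSpec):

    @step
    def start(self):
        import duckdb
        import pandas as pd

        # Hypothetical dataset prefix; point it at your own data.
        with S3(s3root="s3://my-bucket/my-dataset/") as s3:
            frames = [pd.read_parquet(obj.path) for obj in s3.get_all()
                      if obj.key.endswith(".parquet")]
        df = pd.concat(frames, ignore_index=True)

        # DuckDB resolves 'df' to the local dataframe and runs SQL against it.
        self.result = duckdb.query("SELECT COUNT(*) AS n FROM df").to_df()
        print(self.result)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    SQLOnS3Flow()
```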
How do I query a database with SQL and load the results into Pandas?
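A minimal sketch using the standard-library `sqlite3` driver and `pandas.read_sql_query`; the `example.db` file and `users` table are hypothetical, and for other databases you would pass a SQLAlchemy engine or the appropriate DBAPI connection instead.

```python
import sqlite3
import pandas as pd

# Hypothetical database file and table; replace with your own connection.
with sqlite3.connect("example.db") as conn:
    df = pd.read_sql_query("SELECT * FROM users WHERE age > 30", conn)

print(df.head())
```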
How do I load data from a local directory structure on AWS Batch using Metaflow's S3 client?
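One pattern, sketched below with a placeholder `my_data_dir`, is to upload the directory from a locally running step with `put_files` and pull it back down inside the Batch step with `get_all`; `S3(run=self)` scopes the objects to the current run and assumes the flow uses an S3-backed datastore.

```python
import os
from metaflow import FlowSpec, S3, batch, step

class DirToBatchFlow(FlowSpec):

    @step
    def start(self):
        # Runs locally: walk the directory and upload every file,
        # keyed by its path relative to the directory root.
        uploads = []
        for root, _, files in os.walk("my_data_dir"):
            for name in files:
                path = os.path.join(root, name)
                uploads.append((os.path.relpath(path, "my_data_dir"), path))
        with S3(run=self) as s3:
            s3.put_files(uploads)
        self.next(self.process)

    @batch(cpu=1, memory=4000)
    @step
    def process(self):
        # Runs on AWS Batch: download the same files into the container.
        with S3(run=self) as s3:
            for obj in s3.get_all():
                print("downloaded", obj.key, "->", obj.path)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    DirToBatchFlow()
```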