32 docs tagged with "data"

Flow Artifacts

How is data stored in Metaflow runs? What can I track with Metaflow?

Load Parquet Data from S3 to Arrow Table

I have a Parquet dataset stored in AWS S3 and want to access it in a Metaflow flow. How can I read one or several Parquet files at once from a flow and use them in an Arrow table?

Load Parquet Data from S3 to Pandas DataFrame

I have a Parquet dataset stored in AWS S3 and want to access it in a Metaflow flow. How can I read one or several Parquet files at once from a flow and use them in a Pandas DataFrame?

Python Data Structures for Tabular Data

Python is one of the main programming languages used by data science and machine learning practitioners, so it is important for programmers and scientists to understand the most common data structures in Python’s machine learning ecosystem.