Read dataframe from S3

Tags: #aws #cloud #storage #S3bucket #operations #snippet #dataframe
Author: Maxime Jublou
Reference: AWS Data Wrangler

Input

Import libraries

try:
    import awswrangler as wr
except ModuleNotFoundError:
    # Install the library on first use, then retry the import
    !pip install awswrangler --user
    import awswrangler as wr

Setup AWS

# Credentials
AWS_ACCESS_KEY_ID = "YOUR_AWS_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY = "YOUR_AWS_SECRET_ACCESS_KEY"
AWS_DEFAULT_REGION = "YOUR_AWS_DEFAULT_REGION"

# Bucket
BUCKET_PATH = "s3://naas-data-lake/dataset/"

Setup Env

%env AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
%env AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
%env AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION
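
The %env magics above only work inside IPython or Jupyter. As a minimal sketch of the same step in a plain Python script, the credentials can be written to os.environ instead. The helper name export_aws_credentials is hypothetical, and the placeholder values are repeated here only so the example runs on its own.

```python
import os

# Placeholder values, repeated from "Setup AWS" so this sketch is self-contained.
AWS_ACCESS_KEY_ID = "YOUR_AWS_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY = "YOUR_AWS_SECRET_ACCESS_KEY"
AWS_DEFAULT_REGION = "YOUR_AWS_DEFAULT_REGION"


def export_aws_credentials(access_key_id: str, secret_access_key: str, region: str) -> None:
    """Hypothetical helper: expose the credentials as environment variables."""
    os.environ["AWS_ACCESS_KEY_ID"] = access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    os.environ["AWS_DEFAULT_REGION"] = region


export_aws_credentials(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION)
```

Variables set this way last only for the current process, which matches the behaviour of %env in a notebook session.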

Model

Get dataframe

df = wr.s3.read_parquet(BUCKET_PATH, dataset=True)
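
With dataset=True, read_parquet also accepts a partition_filter callback, so only some partitions are downloaded. The sketch below assumes the dataset is partitioned by a column named year, which is purely an illustration and not something this notebook states. The names is_recent_partition and read_recent are hypothetical.

```python
def is_recent_partition(partition: dict) -> bool:
    """Keep partitions whose (assumed) `year` value is 2023 or later."""
    # Partition values arrive as strings extracted from the S3 key.
    return int(partition.get("year", "0")) >= 2023


def read_recent(bucket_path: str):
    """Read only the partitions accepted by is_recent_partition."""
    # Imported lazily so the filter can be checked without the library installed.
    import awswrangler as wr

    return wr.s3.read_parquet(
        bucket_path,
        dataset=True,
        partition_filter=is_recent_partition,
    )
```

Calling read_recent(BUCKET_PATH) would then return a dataframe restricted to the matching partitions, provided the dataset really has a year partition.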

Output

Display result

df