How to Sync an S3 Bucket with GitHub Actions
An Abbreviated Guide to Navigating GitHub Actions
Table of Contents
- Ramblings About My Fave Technologies
- Why Sync With An S3 Bucket
- What Are GitHub Actions
- Key Concepts You Need To Know
- Workflow vs. Job vs. Steps
- Why Use GitHub Actions
- Creating Your Own Action
- Using GitHub Actions To Sync A Directory With S3
- Testing An Action
- Interpreting Failures
- Successfully Syncing To S3
- Learn More
My Favorite Technologies
People often ask me which part of the stack I prefer, and for fear of pigeonholing myself too early in my career, I always say, “I like it all!” It’s not necessarily a lie, but in this moment, I can honestly admit that the tickets I get most excited about include:
- CSS: Out of frustration and influence from other slightly ignorant developers, I swore off CSS. I was tired of writing margin-left: -33.3333333%. After discovering that CSS is so powerful that developers can literally recreate Da Vinci’s Mona Lisa with it, I’ve happily challenged myself to embrace CSS and write cleaner CSS.
- SQL: Besides HTML, this was the first language I learned and the simplicity of the language helped me to understand the purpose of programming. To me, programming is just retrieving, manipulating, and displaying data. Writing SQL evokes a sense of nostalgia for me.
- React.js: I learned React near the end of my coding bootcamp. During that time, my demo day project was due, I was stressed, experiencing information overload, and mostly focused on graduating. So, while I was learning React, all I saw was HTML mixed with JavaScript and CSS. I had zero understanding of why someone would use it. After I landed a real developer job, I recognized the value of component-based architecture.
- AWS: While I am not a fan of Jeff Bezos, Amazon, or capitalism, AWS inevitably has me in a chokehold! I started using AWS at Hi Marley. It was our bread and butter. From DynamoDB to AWS Lambda, I gained new skills and a new respect for cloud computing. And of course, AWS enabled me to create the coolest feature I’ve developed to date: AutoTranslate (via Amazon Translate).
Basically, the key to my heart is through the front end, databases, or DevOps.
Initial excitement of one of my first encounters with AWS
My love language for programming languages is complaining
More complaining LOL
And even more complaining. In my defense, it was pretty confusing!
Annnnnnd then realizing how much I actually loved and missed AWS lol
It’s safe to say I have a love/hate relationship with AWS, as I do with most technologies.
New Ticket Alert
Obviously, I missed using AWS, so I was excited when my manager asked me if I wanted to pick up a ticket that involved tinkering with AWS and GitHub.
The problem: When people on our team need to add new assets to the codebase, we often have to upload the assets to AWS S3, but sometimes we have to request access or temporarily update permissions, which feels like a time sink and creates unnecessary blockers. Instead, it would be nice if we could:
- Treat our assets as code
- Focus solely on merging assets as we do with code
- Sync our assets to an S3 bucket
The solution (suggested by my manager): GitHub Actions
Although I knew that my company used GitHub Actions, I wasn’t really sure what it was because I had never personally used it, so I set out to answer a few questions:
What are GitHub Actions?
GitHub Actions is a tool that enables you to automate custom workflows inside your GitHub repository. While you can write actions on your own, there is also a marketplace filled with ready-made actions created by developers around the world. Organizations commonly use GitHub Actions for repeated tasks such as checking for passing tests and managing releases.
Essentially, you can trigger a workflow to run a set of jobs when a specified event occurs in your repository.
What do any of these words mean?
- Workflow - a configurable, automated process, defined in a YAML file, composed of one or more Jobs. Workflows are triggered by Events.
- Job - a set of Steps within a single Workflow that run in order on the same runner
- Step - an individual task within a Job that runs either a shell command or an Action
- Action - a reusable, language-agnostic block of code that a Step can execute. Actions can be combined as Steps to create a Job.
- Event - a specific activity in your repository, such as a push or a pull request, that triggers a Workflow.
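To make the vocabulary concrete, here is a minimal sketch of a workflow file with each term labeled. The file name, workflow name, job_id (greet), and step names are hypothetical and chosen only for illustration; actions/checkout is a real action published by GitHub.

```yaml
# Hypothetical file: .github/workflows/hello.yml
name: hello-workflow                # Workflow: everything in this file

on: push                            # Event: any push to the repository triggers the Workflow

jobs:
  greet:                            # Job: "greet" is this job's job_id (hypothetical name)
    runs-on: ubuntu-latest          # the runner that all of this Job's Steps share
    steps:
      - name: Check out the code    # Step that uses an Action
        uses: actions/checkout@v4
      - name: Say hello             # Step that runs a shell command
        run: echo "Hello from GitHub Actions"
```

Reading it top to bottom mirrors the hierarchy: one Event starts the Workflow, the Workflow holds Jobs, and each Job holds Steps that either run a command or use an Action.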
Why isn't a workflow just a collection of steps?
As I mentioned earlier, a workflow is a collection of jobs, and a job is a collection of steps. Grouping steps into jobs enables you to configure how and when those steps run. By default, jobs run in parallel, but you can run jobs sequentially by making one job depend on the result of another. GitHub’s documentation explains how to assign a job_id to each job and reference it with the needs keyword to run your jobs sequentially.
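As a rough sketch of sequential jobs, the workflow below uses the needs keyword from GitHub's workflow syntax so that one job waits for another. The workflow name, the job_ids (build and deploy), and the echo commands are hypothetical placeholders, not part of my actual setup.

```yaml
name: build-then-deploy       # hypothetical workflow name
on: push

jobs:
  build:                      # hypothetical job_id
    runs-on: ubuntu-latest
    steps:
      - run: echo "building"

  deploy:                     # hypothetical job_id
    needs: build              # wait for the build job to succeed before starting
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploying"
```

Without the needs line, both jobs would start at the same time. With it, deploy only starts after build finishes successfully.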
Why use GitHub Actions?
- It’s free to use for any GitHub user! Well, it is mostly free. Public repositories can use it at no cost, and the free plan generously provides developers with 2,000 build minutes per month for private repositories.
- It’s built into GitHub! Who loves using fewer tools to accomplish the same task? I do! Because GitHub Actions is owned by GitHub and fully integrated into GitHub, you don’t need to set up any third-party tools or do much configuration at all. By default, GitHub Actions is enabled on all repositories and for all organizations. You can easily manage your CI/CD pipeline on the same platform where you manage your repository.
- It supports multi-container setups! Developers can use Docker and docker-compose files in GitHub Actions workflows, which allows you to test applications that span multiple containers.
- There’s a bit more flexibility! With GitHub Actions, developers have access to GitHub's API.
- It’s great for organizations of all sizes, but especially for startups, in my opinion. The benefits I listed above may come in handy for engineering teams at startups, which often need a CI/CD pipeline that is inexpensive, fast, flexible, and easy to maintain.
- Your CI/CD pipeline configuration is treated as code and lives alongside your project’s code!
- There is a vibrant, open-source community! Developers are encouraged to build, publish, and use actions on GitHub Marketplace. Like I mentioned before, instead of creating your own action from scratch, you can literally copy and paste a ready-made template you found on the marketplace. I’ve also found that GitHub Actions don’t have to be used only for business. Some developers have opted to create a few fun actions.
Cool/Random/Silly GitHub Actions I’ve Found:
- A blog post workflow
- Automating Curation of Photography
- Turning on Smart Lights with a Commit
- Uploading a YouTube Video to AnchorFm
- Automating a Venmo payment
How do I write my own GitHub Actions?
To write your own workflow for GitHub Actions, create a directory named .github/workflows/ at the root of your project.
This directory is where you will store your workflows. Each workflow lives in its own file with a .yml or .yaml extension and is written using YAML syntax.
How do I sync a directory with a remote S3 bucket?
Conveniently, I didn't have to write any of my own actions. I found an action on GitHub that did exactly what I wanted. It was called GitHub Action to Sync S3 Bucket. I simply copied and pasted the pre-written code and followed the directions in the README.
- name: run s3-sync
  uses: jakejarvis/s3-sync-action@master
  with:
    args: --acl public-read --follow-symlinks --delete --exclude '.git/*' --exclude '.github/*'
  env:
    AWS_S3_BUCKET: ${{ format('botany-nudge-assets-{0}', env.ENVIRONMENT_NAME) }}
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    SOURCE_DIR: './nudges/assets'
I did have to preface the prewritten action with a few more instructions to ensure that the workflow ran when desired:
name: s3-sync
on:
  push:
    branches:
      - dev
      - production
    paths:
      - 'folder/path/**'
This defines the name of the workflow.
name: s3-sync
This refers to the event that will trigger the workflow, so my workflow will only run if a push event occurs.
on:
  push:
This lists the branches the workflow watches. My workflow will only be triggered if the push event occurs on the dev or production branch.
    branches:
      - dev
      - production
This filters by file path. My workflow will only run if the push includes changes to a specific directory.
    paths:
      - 'folder/path/**'
To summarize, I customized my workflow to run when someone pushes a change to a specified directory on a dev or production branch.
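Putting the trigger and the sync step together, a complete workflow file might look roughly like the sketch below. Several parts are assumptions on my part rather than a copy of my real file: the file name, the job_id (sync), the checkout step (the sync action needs the repository files on the runner), the paths filter pointing at the same directory as SOURCE_DIR, and setting ENVIRONMENT_NAME from the branch name.

```yaml
# Hypothetical file: .github/workflows/s3-sync.yml
name: s3-sync

on:
  push:
    branches:
      - dev
      - production
    paths:
      - 'nudges/assets/**'    # assumption: watch the same directory that gets synced

jobs:
  sync:                       # hypothetical job_id
    runs-on: ubuntu-latest
    env:
      # Assumption: the environment name matches the branch name (dev or production)
      ENVIRONMENT_NAME: ${{ github.ref_name }}
    steps:
      - name: check out the repository
        uses: actions/checkout@v4
      - name: run s3-sync
        uses: jakejarvis/s3-sync-action@master
        with:
          args: --acl public-read --follow-symlinks --delete --exclude '.git/*' --exclude '.github/*'
        env:
          AWS_S3_BUCKET: ${{ format('botany-nudge-assets-{0}', env.ENVIRONMENT_NAME) }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          SOURCE_DIR: './nudges/assets'
```

The two AWS credentials are read from the repository's secrets, so they never appear in the workflow file itself.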
How do I test my GitHub Action?
The one disadvantage I found with using GitHub Actions is the amount of time it takes to test. To ensure my script was working the way I intended, I had to make a pull request and wait for the action to run after each commit. This resulted in 99+ commits! (Disclaimer: I squashed all these commits before merging). To avoid having a million commits, you can test your action locally with this nifty tool.
How do I read the logs?
Reading logs can often feel like finding a needle in a haystack. That experience is not that different for GitHub Actions, but there are a few pros:
- You can watch the log activity in real time in your browser
- The logs are grouped by step, which makes them easier to read
Was I successful?
After I resolved all the failures, I had one more check left: ensuring that the assets from my directory had synced with the S3 bucket. I hesitantly logged into AWS, headed over to Amazon S3, and opened the target S3 bucket.
DUN, DUN, DUN….
The bucket was filled with all the intended assets!
I was so happy that I was able to complete a successful run! Overall, it was a great experience (even though not that much AWS was involved). It’s okay because I learned something new. GitHub Actions is fast, lightweight, and convenient, and its marketplace is full of open-source actions, so you’re rarely ever starting from scratch. My favorite feature is that it allows you to write expressions, variables, and conditional statements, which makes the tooling seem limitless.
Here’s an example of how I used conditional statements to execute different actions depending on the environment name:
- name: run s3-sync on staging
  # Only runs when the environment name is staging
  if: env.ENVIRONMENT_NAME == 'staging'
  uses: jakejarvis/s3-sync-action@master
  with:
    args: --acl public-read --follow-symlinks --delete
  env:
    AWS_S3_BUCKET: ${{ secrets.STAGING_NUDGE_ASSET_BUCKET }}
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    SOURCE_DIR: './nudges/assets'
- name: run s3-sync on prod
  # Only runs when the environment name is production
  if: env.ENVIRONMENT_NAME == 'production'
  uses: jakejarvis/s3-sync-action@master
  with:
    args: --acl public-read --follow-symlinks --delete
  env:
    AWS_S3_BUCKET: ${{ secrets.PRODUCTION_NUDGE_ASSET_BUCKET }}
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    SOURCE_DIR: './nudges/assets'
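Expressions can also do the choosing inline. As an alternative sketch (not what I actually shipped), a single step could pick the bucket with the && and || operators that GitHub's expression syntax supports. Note the differences in behavior: this version falls back to the staging bucket for any environment name other than production, and also if the production secret is empty, whereas the two-step version skips the sync entirely when neither condition matches.

```yaml
- name: run s3-sync
  uses: jakejarvis/s3-sync-action@master
  with:
    args: --acl public-read --follow-symlinks --delete
  env:
    # Production bucket when ENVIRONMENT_NAME is 'production', otherwise the staging bucket
    AWS_S3_BUCKET: ${{ env.ENVIRONMENT_NAME == 'production' && secrets.PRODUCTION_NUDGE_ASSET_BUCKET || secrets.STAGING_NUDGE_ASSET_BUCKET }}
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    SOURCE_DIR: './nudges/assets'
```

I find the two-step version easier to read in the logs, since each run shows exactly which step was skipped, but the one-step version avoids duplicating the configuration.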
How can I learn more?
If you’re looking for a quick guide to GitHub Actions, check out the link below: github.github.io/actions-cheat-sheet/action..
And here is a video playlist guide to GitHub Actions from GitHub Dev Advocate, Brian Douglas.
-- Rizel B.