Stack & Infrastructure

We used a variety of technologies to build Burst Your Bubble: some we already knew, and some we wanted to learn.

The stack

Our web application needed to communicate easily with our machine learning models and the database. Because article classification was done in Python with the scikit-learn library, we built the backend in Python as a Flask application. Flask also lets us create an API with very little overhead while still supporting essential web features such as sessions and cookies, which we need for user authentication.
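As a rough illustration of how little code Flask needs for an API endpoint backed by session cookies, here is a minimal sketch. It is not the project's actual code: the route names, the `username` field, and the placeholder secret key are all assumptions made for the example, and real password checking is left out.

```python
from flask import Flask, jsonify, request, session

app = Flask(__name__)
# Hypothetical placeholder; a real deployment would load a secret from config.
app.secret_key = "dev-only-placeholder"


@app.route("/api/login", methods=["POST"])
def login():
    # Assumption: credential checking happens elsewhere; this sketch only
    # records who the user claims to be.
    payload = request.get_json(silent=True) or {}
    username = payload.get("username")
    if not username:
        return jsonify(error="username required"), 400
    session["username"] = username  # Flask stores this in a signed cookie
    return jsonify(ok=True)


@app.route("/api/me")
def me():
    # Read the value back from the session cookie sent with the request.
    username = session.get("username")
    if username is None:
        return jsonify(error="not logged in"), 401
    return jsonify(username=username)
```

Flask's built-in test client (`app.test_client()`) keeps cookies between requests, so a login followed by a call to the hypothetical `/api/me` route can be exercised without running a server.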

We stored the data required for the web app in a MySQL database. We chose a relational database so that we could easily traverse the relationships between the objects we create and use (for instance, going from a user to the list of articles they have read). This also let us use a popular Object-Relational Mapping (ORM) package, Python's SQLAlchemy, to abstract away the database engine.
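The user-to-articles traversal mentioned above might look something like the following sketch. The table and column names (`users`, `articles`, `reads`) are hypothetical, not the project's real schema, and the example uses an in-memory SQLite database so that it runs without a MySQL server; with SQLAlchemy, switching engines is largely a matter of changing the connection URL.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Hypothetical association table linking users to the articles they have read.
reads = Table(
    "reads",
    Base.metadata,
    Column("user_id", ForeignKey("users.id"), primary_key=True),
    Column("article_id", ForeignKey("articles.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(80), nullable=False)
    articles_read = relationship(
        "Article", secondary=reads, back_populates="readers"
    )


class Article(Base):
    __tablename__ = "articles"
    id = Column(Integer, primary_key=True)
    title = Column(String(200), nullable=False)
    readers = relationship(
        "User", secondary=reads, back_populates="articles_read"
    )


def demo():
    # Assumption: SQLite in memory stands in for the MySQL database.
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    with Session(engine) as db:
        user = User(name="alice")
        user.articles_read.append(Article(title="Example headline"))
        db.add(user)
        db.commit()
        # Go from a user to the articles they have read via the relationship.
        loaded = db.query(User).filter_by(name="alice").one()
        return [article.title for article in loaded.articles_read]
```

Calling `demo()` creates the tables, stores one user with one article, and returns the titles reached through `articles_read`, with no hand-written SQL joins.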

On the front end, we built the application with React, a popular user interface library that breaks an application down into reusable components. This is extremely useful for complex applications like ours, where the same UI elements appear in several places. None of us were familiar with React before the development phase of the project, but we wanted to learn how to develop user interfaces with it.

Amazon Web Services

We also wanted to gain experience with cloud deployments. While the project can be run fully locally with the built-in Flask server, we wanted a production environment where we could all interact with the app and know we were all using the same code, underlying operating system, packages, and database. We created an account on the Amazon Web Services (AWS) free tier and configured the following services:

  • Elastic Compute Cloud (EC2): The server that runs our code, along with a daily cron job that fetches fresh news articles.
  • Relational Database Service (RDS): Managed MySQL hosting, which we expected to perform better than a MySQL installation on our EC2 instance.
  • Identity and Access Management (IAM): Roles and permissions that control how our AWS resources are allowed to communicate with each other.

We also wanted to decentralize our deployment process and allow anyone on our team to deploy code to production. We set up additional AWS services to enable automatic code deployments whenever we push to GitHub:

  • CodeDeploy: Continuous deployment manager; receives code and installs it on our EC2 instance.
  • CodePipeline: Watches GitHub for changes and automatically triggers a deployment with CodeDeploy whenever new code is pushed to master.
  • Simple Storage Service (S3): Temporary storage of code archives.