Literature search
When you have an idea of some software you want to build, or a software system you want to assemble, it's really easy to try to just build it from scratch out of your own brain. This is almost always a mistake.
Even if you're going to build an entire product yourself, it's important to know what other people have done when faced with similar goals and problems. By knowing what other people have done, you can take advantage of their insights, steer around pitfalls, and build something better and more interesting than you would build by just using your own experience as a guide.
With that in mind, it's time to do a literature review.
Identify your research questions
The first step is to come up with some questions for which the available literature might provide you with insight. Here are some examples of the kind of question I have in mind.
- What techniques have people developed for identifying malicious browser extensions? (And how well do those techniques work?)
- Same questions, but for DDoS mitigation.
- What kinds of software systems use YARA rules? What techniques have been used to develop useful YARA rules?
- What malware taxonomies exist, and how do they map to famous examples?
- What are the best techniques for setting up a malware sandbox environment in which it is safe to run malware? And what kind of data collection can you do in such environments?
- ...
These examples are a good start, but for any given project, you'll also want to think
What to do
- Identify a small collection of research questions (say, 2-4 questions, depending on your project).
- For each question, find resources that address the question. These might be academic papers, white papers from companies, blog posts, software repositories, videos, etc. If each team member spends 2-3 hours on this task, you should end up with a pretty long list of resources (certainly in the dozens).
- Curate your list. Some of the stuff you find will not be very interesting or useful. Some of it will be ground-breaking and widely used in the field. Some resources won't break new ground, but will be good at explaining stuff that's relevant to your project. Cluster your resources by question and apparent value to help you decide which resources to read carefully, and which ones to skim or discard.
- Based on the resources you find, put a simple annotated bibliography in your team's repository. Call it bibliography.md or .docx or .pdf, though .txt is OK too. Break your bibliography into one section for each research question. For each resource, include:
  - a very brief description (typically 1-2 sentences)
  - enough information to enable somebody to find the resource (mostly, this will just be title + author + link)
  - your thoughts, if any, about the likely applicability of the resource for your project
- Option: you can, as suggested above, organize your report by research question. But if the resources don't nicely fall into that kind of organization, feel free to organize your bibliography in a way that feels more natural and helpful to you.
- Suggestion: create a directory in your repository named "planning", and "git mv" your original project ideas document plus your bibliography in there. There will be at least two more drafts of planning documents before we're through, so it would be nice to have all of that stuff in one place instead of littering the top level of your repo.
- Note: I don't expect you to have fully read and understood all of your resources by the time you turn in this bibliography. Your goal, rather, is to figure out which resources you really need to focus on as you get started on your project.
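If it helps to see the bibliography layout concretely, here is a minimal Python sketch that renders one section per research question, with a description, locating information, and optional applicability notes for each resource. This is only one possible way to produce the file, not a required tool: the `Resource` class, the `render_bibliography` function, the Markdown layout, and the placeholder entry are all assumptions made for illustration. Writing the file by hand is just as acceptable.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """One bibliography entry (hypothetical structure for illustration)."""
    title: str
    author: str
    link: str
    description: str          # very brief, typically 1-2 sentences
    applicability: str = ""   # optional thoughts on relevance to the project


def render_bibliography(sections: dict[str, list[Resource]]) -> str:
    """Render a Markdown document with one section per research question."""
    lines = ["# Annotated bibliography", ""]
    for question, resources in sections.items():
        lines += [f"## {question}", ""]
        for r in resources:
            # Title + author + link is usually enough to find the resource.
            lines.append(f"- **{r.title}**, {r.author}. <{r.link}>")
            lines.append(f"  - Description: {r.description}")
            if r.applicability:
                lines.append(f"  - Applicability: {r.applicability}")
        lines.append("")
    return "\n".join(lines)


# Placeholder entry only; replace with the real resources your team finds.
example = {
    "What kinds of software systems use YARA rules?": [
        Resource(
            title="(resource title)",
            author="(author)",
            link="https://example.com/placeholder",
            description="(1-2 sentence description)",
            applicability="(why this might matter for the project)",
        )
    ]
}

if __name__ == "__main__":
    print(render_bibliography(example))
```

Redirecting the output to bibliography.md would give you a starting file that you can then edit by hand.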