Using other people's code
Learning to program in any new context is a zig-zag affair. You need to read and play with sample code, read technical documentation, build new projects on top of starter templates, talk to other people both more and less experienced than you, etc. Often, you need to struggle on your own to fully internalize a concept or technique; other times, you just need somebody to show you.
Sometimes when you are solving a computer programming problem, you find help in the form of somebody else's code. But what code is acceptable to use in your CS class assignments and personal projects, and what responsibilities do you accept by using it?
This document will provide a partial answer for an academic context, but the questions are also important for professional programmers.
Academic integrity at Carleton
Carleton has a policy on academic integrity. This policy applies to all your academic work at Carleton, so you should familiarize yourself with it. Really, read it right now—it's only 490 words.
Licenses
Many modern code reference websites use some variant of the Creative Commons licenses, which are designed to be enforceable in as many countries' legal systems as possible. For example, the license known as Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) currently covers the code that people post in discussions on Stack Overflow.
Suppose you choose to adapt, build upon, or copy some code that is covered by one of the "Attribution-ShareAlike" licenses. If you distribute your resulting code (e.g. hand it in as a class assignment, post it online, sell it to a customer, or email it to a friend), you take on responsibilities delineated in the license. See each specific license for complete details, but for the Attribution-ShareAlike licenses, your main responsibilities are:
- You must provide "appropriate credit" to the author of the code
- You must indicate whether you have made changes to the shared code
- You must distribute the adapted code under the same license as the original code
There's lots more to it, but that's the gist. Check out the Creative Commons website for a wealth of additional information.
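As a rough illustration, the three responsibilities listed above might be recorded in a comment directly above the adapted code. This is only a sketch: the author name, the source URL, and the function itself are hypothetical placeholders, and the license text remains the authority on what counts as "appropriate credit".

```python
# Credit:  adapted from an answer by "example_author" (placeholder name)
# Source:  https://example.com/original-post (placeholder URL)
# Changes: renamed variables and added handling for empty input
# License: CC BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0/
def mean(values):
    """Return the arithmetic mean of values, or 0.0 for an empty sequence."""
    if not values:
        return 0.0
    return sum(values) / len(values)
```

Each comment line maps to one responsibility: credit to the author, a note on what was changed, and the license under which the adapted code is shared.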
(Note that this stuff can get very complicated. For example, what happens if you adapt chunks of code from more than one author under more than one license? Distributing your adaptation then involves you in multi-licensing issues. Here are some relevant discussions.)
There are many other software licenses in common use, each of which grants some rights and imposes some responsibilities. For example, there's the MIT License, the Apache License, the GNU Free Documentation License, and the GNU General Public License. As a general rule, it's important to pay attention to what license covers any code you plan to copy or adapt within your own software.
Additional issues in CS class
When you're writing software for a college class and you want to use or adapt somebody else's code, there are some additional things to consider. For example:
- Problems from a textbook. If you're working on a problem or project associated with a popular textbook, students at other schools have probably posted complete solutions for the problem. Your best choice in this situation is to not even look at such solutions, much less borrow code from them. This is the best strategy not just because of academic integrity issues; it's also best if you wish to learn the course material. (Also, many posted solutions are badly written, incorrect, or even dangerous to run.)
- Code from other students. Can you use code written by a classmate? What if the student took the class some other term? What if the student is a lab assistant just showing you how to do something? Does it matter whether it's a couple lines of code versus a whole function versus a whole file? These are tricky questions with no fixed answers. To get greater clarity, pay close attention to...
- What does your professor say? Most professors post online information about their policies for collaboration and code-sharing. Read that information carefully for every class, even if you have taken a class from this professor before.
Rough guidelines
The final word on what's acceptable in class comes from your professor and the college's policies. But here are some guidelines for your behavior that are likely to serve you well as minimum requirements no matter what class you're in.
- You need to understand your code. No matter what source your code comes from, it's essential for you to understand how it works. This is true in an educational context, where learning is the whole point. But it is also true in a professional development context, where insidious bugs are often introduced by developers who are in a hurry and don't think through the implications of their code.
- Think before you search. Often, you can solve a programming problem more effectively by thinking with a pad of paper or writing a little experimental code than by searching the internet. Struggling is an essential part of the process of learning, so don't cut off your learning prematurely by diving right into online discussions.
- Don't borrow much. The amount of code you obtain from other people matters in an academic context. Very small chunks of code on a narrowly targeted topic are usually going to be OK (e.g. "I didn't remember Python's regular expression syntax for grouping, so I took these two lines from a blog"). But when you start copying and editing more than four or five lines from a single source, you should hesitate and see if you can solve the same problem without that source's help.
- Attribute everything. Did you copy or adapt code from somebody else, even just a line or two? Provide a brief explanation and a link to the original code. Use phrases like "Taken from" or "Adapted from" to indicate whether you have changed the original code. Attributing everything is both the right thing to do and significant protection against academic integrity violations.
- Provide a citation close to the borrowed code. Provide attribution close to the site of the code you have borrowed or adapted. For example, if you took a couple lines of code from a blog and reworked them for your context, put a comment just above those lines or at the top of the relevant function. Include a link, of course.
- Don't worry about overdoing attribution. Did the professor give you starter code? There's no harm and a lot of benefit in saying so in the comment at the top of your source code: "Adapted from sample code provided by Professor XYZ".
- Longer projects: consider a credits section in a readme file. Putting all your external sources in one place can be helpful to your reader, and goes a long way toward meeting your license obligations. Do this in addition to the localized attribution described above.
- Code from other students. Expectations about sharing code between students will vary a lot from professor to professor. The safest approach is to limit your discussions with other students to general approaches to the problem, not sharing or examining each other's code.
- Samples from official documentation. Many official technical reference sites (e.g. the Python documentation) include sample code. These samples, properly cited, are generally fair game for your use.
- What about Stack Overflow? Your professors certainly make use of this powerful resource themselves, and most of them are likely to allow you to make judicious use of Stack Overflow discussions in your work. That said, it's essential to limit the amount of code you borrow in total and from any given source. If you find yourself using more than, say, four lines of code from a single Stack Overflow discussion or code from more than two Stack Overflow discussions for a single assignment, it's time to talk to your professor about the scope of acceptable code borrowing and how to become less dependent on online sources.
- What about LLMs like ChatGPT and GitHub Copilot? As with Stack Overflow, you should not be using more than a small handful of lines of code generated by AI-based code-generating assistants like ChatGPT and Copilot. Early evidence suggests that ChatGPT is capable of remarkable feats of code generation, but also that its output is wrong a substantial amount of the time. There's also a study released in late 2022 suggesting that current code assistants write a lot of insecure code even when it's functionally correct. Keep your eyes open, and don't lose sight of the "you need to understand your code" item at the top of this list.
- Don't forget: there's a lot of bad code online. Way more frequently than you might think, the code you find online is inefficient, inelegant, insecure, or simply wrong. You're in school to deepen your own understanding, so use your programming assignments as an opportunity to develop your own judgment.
- Apply the smell-test. Are you uneasy about using a piece of code that comes your way? That's the perfect time to have a conversation with your professor before proceeding.
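To make the attribution guidelines above concrete, here is one way the comments might look in a small Python file. Treat it as a sketch rather than a required format: the blog link is a placeholder, the function name is invented for illustration, and your professor's policies take precedence over anything shown here.

```python
# Adapted from sample code provided by Professor XYZ.
import re


def split_date(text):
    """Return (year, month, day) as strings for a YYYY-MM-DD date, else None."""
    # Adapted from a blog post (placeholder link: https://example.com/regex-groups).
    # Changed: the original pattern matched a time of day; this one matches a date.
    match = re.match(r"(\d{4})-(\d{2})-(\d{2})$", text)
    return match.groups() if match else None
```

The file-level comment credits the starter code, while the comment just above the borrowed lines names the source, links to it, and uses "Adapted from" to signal that the original was changed.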