OMSCS — Machine Learning
Overview
This class is extremely polarizing among students. Undoubtedly, it has a unique grading structure that can be very frustrating for those expecting traditional breakdowns for what scores constitute an A versus a B. While explicit grading rubrics are not shared for the papers you write, Professor Isbell is very open about how he has structured his class and what his grading philosophies are. Regardless of the scores you receive on Canvas (they will likely be much lower than you are used to), as long as you stick with it and try your best, you’re almost guaranteed a B. More likely, you will get an A. And while it may be hard to believe at times, trust that the class has been designed in every way to give you the benefit of the doubt. As long as you’re trying your best to learn something, everything will be okay.
Okay, with that said, it can still be disheartening to see that you have a 60% average by the time the withdrawal deadline comes around. Here are some practical tips that should convince you that it’s okay.
General Gameplan
- Multiple and repeated consumption of the material (like in most classes) is the key to success. Lectures are king; watch them more than once. George’s notes are a nice and quick refresher. Many students skip the Mitchell textbook, but it’s pretty short and digestible.
- The Slack channel tends to be a vibrant hub, if that’s your thing.
- Turn in the optional homework. It can be tempting to skip, but there’s little downside since it is not graded, and it may end up helping you bump up a letter grade.
- As mentioned before, do not sweat getting ‘bad’ scores.
Papers
- The number one mistake students make is trying to get ‘good’ results for their paper. Good results do not matter. Good results are time-consuming; chasing them is why students claim to spend 30+ hours a week on this class. Good results are subjective to the point of being meaningless.
- Analysis is all that matters. The instructors are very clear that this is the focal point of the papers. It is truly shocking how many students come to a revelation on Piazza/Slack/OMSCentral along the lines of “I figured out that when I thought about my results and analyzed them, I got a much better grade” as if it were a big secret that wasn’t literally in the directions of the assignment.
- So now comes the question of what makes good analysis. Office hours are where you’ll find your answer and, per the syllabus, they are actually ‘required’ viewing. They have a unique format: students post questions in a dedicated Piazza thread during the week leading up to office hours, and then the head TAs go through them one by one and answer them. In the ‘first week’ of an assignment (technically, all assignments are available from the start, but there is a general cadence to when students actually start each one), here are some good questions that somebody (hint: it can be you) should ask:
“Can you give a high level overview of what you expect for this assignment?”
“What kind of experiments should we run? What are the types of charts and visualizations that you expect?”
“What are some common mistakes that you’ve seen students make? What are some things we should avoid?”
- After those general high level questions are out of the way, remember that you can ask any question, no matter how general or specific, and they will provide very detailed answers. This is yet another good reason to start early, since you won’t be left waiting on an answer to a question for the office hours right before the deadline.
- Students complain that the assignments are too vague with an ambiguous rubric. While they are definitely open-ended, there’s a lot more direction than you might first think. The assignment directions are written in a uniquely informal way, without a list of bulleted requirements like you may be used to in other classes. They have a bunch of questions written in paragraph form that you may even mistake for rhetorical questions. Do not be fooled. Treat every question and comment in the assignment spec as a hard requirement (they basically are). In your paper, be very clear where you are addressing those requirements (like with a section header). Your TA is spending 5 minutes skimming your paper and following along with a rubric, so make it as easy for them as possible by making sure you are highlighting all the good stuff.
- It really bears repeating that analysis is what matters. Students often spend a lot of time running lots of experiments, trying to tune their parameters to get the best score they can, and plotting everything they can possibly think to plot. These are results; they are not analysis. Find a parameter you want to tune. See how the results change as that parameter goes up or down. Plot that general trend. Think about the data, connect it to the lecture material, and explain why you think that trend is occurring (it should not surprise you that this is the most important part). Move on. It does not matter that the model does best when alpha is exactly 0.3465. It only matters to say that as alpha goes up, your error goes down, and this is why. Keeping it simple and stupid will get you a good grade.
- The first assignment can be pretty daunting if you have never taken any ML-adjacent course before. The book “Hands-On Machine Learning with Scikit-Learn” is available on O’Reilly (which, if you didn’t know, we have access to as GaTech students) and is a great resource for the practical aspects of running machine learning experiments.
- Choose simple datasets where you do not have to do any data preprocessing or cleanup, especially if you are new to running machine learning experiments. Suitable datasets can be found in the UCI repository or on Kaggle. Do not choose extraordinarily large datasets; they just mean more waiting around for your experiments to run. You may find that a lot of your peers are using the same dataset as you, and while that is probably boring for your grader, it is totally fine. Another option, which I used, is to create your own fake data with sklearn’s ‘make_classification’ function.
- Don’t stress too much about what makes a dataset ‘interesting.’ They need to be noisy enough so that your classifiers don’t classify them perfectly. They need to be different enough so that your two datasets get different results with the same classifier. This is most data. For example, I chose a dataset that had many features with relatively few samples, and another with relatively few features and many samples. Those differences can make for an interesting comparison.
- Steal as much code as you can. Really, it’s okay.
- Only after you’ve done the above (and I know, it sounds like a lot) and kept it simple and stupid should you feel free to try anything you want. Get creative. Do as many experiments as you want. You can even completely swap out datasets if you don’t like what you’ve got (you should already have the code structure to do this pretty easily). Students who try doing this too early on will be the ones who scramble to meet the deadline.
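To make the “pick one parameter, watch the trend, explain it” advice concrete, here is a rough sketch of what that loop could look like with scikit-learn. Everything specific in it is my own assumption for illustration, not something from the assignment spec: the dataset sizes, the noise level (flip_y), the choice of a decision tree, and the helper names make_datasets, depth_trend, and plot_trend are all made up. It builds two fake datasets with make_classification (one with many features and few samples, one with few features and many samples) and uses validation_curve to see how training and cross-validation accuracy move as max_depth goes up.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier


def make_datasets(seed=0):
    """Two synthetic datasets with deliberately different shapes (assumed sizes)."""
    # "wide": many features, relatively few samples
    wide = make_classification(n_samples=200, n_features=50, n_informative=10,
                               flip_y=0.1, random_state=seed)
    # "tall": few features, relatively many samples
    tall = make_classification(n_samples=2000, n_features=8, n_informative=4,
                               flip_y=0.1, random_state=seed)
    return {"wide": wide, "tall": tall}


def depth_trend(X, y, depths):
    """Mean train and cross-validation accuracy for each max_depth value."""
    train_scores, val_scores = validation_curve(
        DecisionTreeClassifier(random_state=0), X, y,
        param_name="max_depth", param_range=depths, cv=5)
    # Scores have shape (n_depths, n_folds); average over the folds.
    return train_scores.mean(axis=1), val_scores.mean(axis=1)


def plot_trend(name, depths, train, val):
    """Optional: plot the trend (needs matplotlib installed)."""
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot(depths, train, marker="o", label="train")
    plt.plot(depths, val, marker="o", label="cross-validation")
    plt.xlabel("max_depth")
    plt.ylabel("accuracy")
    plt.title(f"Decision tree on the '{name}' dataset")
    plt.legend()
    plt.savefig(f"{name}_depth_trend.png")


if __name__ == "__main__":
    depths = [1, 2, 4, 8, 16]
    for name, (X, y) in make_datasets().items():
        train, val = depth_trend(X, y, depths)
        print(name)
        for d, t, v in zip(depths, train, val):
            print(f"  max_depth={d:2d}  train={t:.3f}  cv={v:.3f}")
```

The numbers this prints are the results. The paper-worthy part is the paragraph you write next to the plot: why the training and cross-validation curves separate (or don’t) as depth grows, and why the two datasets behave differently.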
Exams (But, really, the midterm)
- Like many aspects of this course, the exams are a unique experience that can leave you feeling broken. No matter what, it really will be okay.
- For whatever pedagogical reason, the midterm is set up to be an extreme time crunch on purpose. It is way too long with way too little time. Time management will be extremely important.
- Do not spend too much time on the true/false section. Each can be answered with one or two sentences. If you’re writing more than that you are doing it wrong.
- Certain questions are worth more than others. Be mindful of that. Get points where you can.
- If you do not immediately know the answer to a question, skip it and move on to the next one. There is not enough time for you to be caught up thinking about a question.
- Don’t word vomit. Just like you don’t want to spend too much time thinking about any one question, you don’t want to waste time writing everything you can think of in hopes of getting more points. The graders will be very good at spotting that. In case Professor Isbell hasn’t emphasized it enough, clear and concise synthesis is all you need.
- Lectures are the best material to study for the exam. Watch them multiple times and take notes. Think about potential questions you could be asked. Do the optional problem set.
- The final intentionally has more time and fewer questions. Most students don’t feel pressed for time for the final.
Should I drop? Can I recover from this grade?
- You should not drop and you can recover from your grade.
- Professor Isbell strongly believes in gradients. If you are on the borderline of a letter grade, your assignment scores show a general non-decreasing trend, and you turned in the optional problem sets, he will bump you up a grade.
- If your final grade is better than your midterm grade (and recall, the final is way easier), he will replace your midterm grade with your final grade.
- Assignment 4 is intentionally easier and worth more points than assignments 2 and 3. Do with that information as you will.
- Per his words: “Doing all the assignments tends to be the difference between A and B students.” You hear that? Just turn everything in.