Natural Language Processing

This course offers an introduction to Natural Language Processing and its applications. NLP lies at the intersection of linguistics (the study of language), programming, mathematics, and artificial intelligence. In this course, we will study models covering both written and spoken language, with applications including machine translation, question answering, and dialog systems. NLP relies heavily on concepts from linear algebra, probability, and calculus in combination with machine learning techniques. You should be prepared to get your hands dirty with mathematical analysis, programming, and data processing. While we will go over generative models including GPT, there is a lot more to NLP, and a significant portion of the course will be spent covering those other topics. Tentative lecture topics include:

  • Language Models (of all sizes)
  • Probabilistic methods & models
  • Part-of-speech tagging, Lexicons, Text classification
  • Vector semantics & word-embeddings
  • Topic modeling
  • Machine translation
  • Information-extraction


Instructor

Raj Venkat
r.venkatesaramani@northeastern.edu

Office Hours: Meserve 303, Thursday 3:00 - 4:30 pm
To request appointments outside of office hours, click here.
If you decide to swing by on a whim, and my office door is open, feel free to bug me.


Syllabus

Click here to download the syllabus.

When & Where

Lectures: 2:50 - 4:30 pm, Mon & Wed, Shillman 135
Questions: 24/7, Campuswire (Group Join Code - 9007)
Written Submissions: until 11:59 pm on the due date, Gradescope
Code Submissions: until 11:59 pm on the due date, portal varies (refer to assignment instructions)


Textbook & References

Primary Text: Speech and Language Processing (3rd ed. draft, free online), Dan Jurafsky and James H. Martin

Additional References:
Statistical Machine Translation, Ch. 13: Neural Machine Translation (draft, free online), Philipp Koehn
Natural Language Processing (free online), Jacob Eisenstein, MIT Press
Mathematics for Machine Learning (free online), Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, Cambridge University Press


Grading

Grades will be based on the following split over course load:
  • Assignments: 40% (5 assignments, NOT equally weighted)
  • Project: 40% (20% for presentation, 20% for final report)
  • Labs: 20% (combination of in-class problems, and additional problems requiring self-study)


  • Final grades will be assigned based on the following scale (note open and closed intervals):
    A [93, 100]
    A- [90, 93)
    B+ [87, 90)
    B [82, 87)
    B- [80, 82)
    C+ [77, 80)
    C [72, 77)
    C- [70, 72)
    F [0, 70)

    Natural rounding will be used, i.e., percentages $\ge x.5$ get rounded up to the next integer, $x + 1$ (94.5 becomes 95, 94.4 does not).
    I reserve the right to curve grades at the end of the semester. While not guaranteed, if a curve is applied, it will necessarily be in students’ favor.
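
As an illustration of how the weights, natural rounding, and interval scale above fit together, here is a minimal Python sketch. It is not the official grade calculator. It assumes each component (assignments, project, labs) is already expressed as a percentage out of 100, and that rounding is applied once to the final course percentage before the letter is looked up; the function names are hypothetical, and any curve is ignored.

```python
import math

# Lower bounds of each letter-grade interval, taken from the scale above.
# Anything below 70 is an F.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"), (87, "B+"), (82, "B"),
    (80, "B-"), (77, "C+"), (72, "C"), (70, "C-"),
]


def course_percentage(assignments: float, project: float, labs: float) -> float:
    """Combine component percentages using the 40/40/20 split."""
    return (40 * assignments + 40 * project + 20 * labs) / 100


def natural_round(percentage: float) -> int:
    """Round x.5 and above up to x + 1.

    Python's built-in round() rounds halves to the nearest even integer,
    so floor(p + 0.5) is used instead.
    """
    return math.floor(percentage + 0.5)


def letter_grade(percentage: float) -> str:
    """Map a final course percentage to a letter (assumes rounding happens first)."""
    rounded = natural_round(percentage)
    for floor, letter in GRADE_FLOORS:
        if rounded >= floor:
            return letter
    return "F"


if __name__ == "__main__":
    total = course_percentage(assignments=90, project=95, labs=85)
    print(total, letter_grade(total))
```

Under these assumptions, a final percentage of 92.5 rounds to 93 and falls in the A interval, while 92.4 rounds to 92 and stays in A-.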

Policies

Homework Submissions

  • Written submissions & labs must be submitted to Gradescope by 11:59 pm Eastern on the due date. Written submissions must be in PDF format.
  • All programming submissions must be pushed to the respective portal (refer to assignment instructions) by 11:59 pm Eastern on the due date.
  • All written solutions must be typeset (i.e., no scans of handwritten assignments will be accepted). The use of LaTeX is highly encouraged, but not mandatory. Overleaf is an excellent browser-based LaTeX editor with real-time compilation capabilities. Once you've created an account, you might find this template useful to get started on Overleaf. You'll need to create a copy of the template in your Overleaf projects to get edit access.
  • The submission portals will remain open a few hours after the deadline. I will download homework submissions once I wake up the day after the due date, and whatever version I download is the one that will be graded. This extra time window is intended only so as to not penalize technical difficulties that may occur around the actual midnight deadline, and should not be treated as an automatic extension.
  • You are encouraged to work with your classmates on the homework problems, but keep discussions at a conceptual level. If you do collaborate, you must write all solutions by yourself, in your own words, and are strictly forbidden from sharing any written solutions or code. You must list all of your collaborators on your submission. The TAs and instructors reserve the right to ask you to explain your solutions.
  • Grading: All grades, including programming assignments, will be released via Gradescope.
  • Regrade Requests: All regrade requests must be submitted within 1 week of receiving your grade. Requests for all submissions must be submitted from within Gradescope. Requests submitted via email will almost certainly be missed.

Policy on the use of Generative AI

  • For programming submissions, students may freely use any AI tool that is available to them, but must cite any such use. I believe that it is imperative that students learn how to use these tools effectively and correctly, while also learning how to properly test and verify code generated by these tools. I recommend not using generative AI to assist with labs.
  • Students may not use generative AI to complete the written portions of homework assignments. The written portions are intended to test students' ability to demonstrate mastery of techniques learnt in the course by a) presenting sound theoretical analysis, and b) critically analyzing their code. Any indication of the use of generative AI in the written submissions will constitute a violation of academic integrity.
  • The use of AI to rephrase sentences and improve writing clarity, etc. - while an acceptable use case - makes it very difficult for instructors to discern whether or not the entire answer was AI-generated. Therefore, if AI is used in this manner, students are required to submit an additional appendix (instructions will be provided as part of the assignments) with their corresponding originally written answers. The submission of such an appendix is aimed only at helping me understand the usage patterns of AI tools. Points will not be taken off for using AI to improve writing.

Late Policy

  • As a general rule, no late submissions are accepted. However, each student is given one 'freebie' - a no-questions-asked one-week extension to a single homework of their choice. The freebie is intended to be a fallback in case of genuine emergencies where coordinating with the instructor may not be feasible. Be wise in how you use this. The freebie may not be used for labs, presentations, or submissions related to the final project.
  • If, when using the freebie, a student submits within 48 hours of the original deadline, the student may then use the remaining 5 days on a different homework submission. In order to make logistics feasible, the remainder may not be split a second time.
  • Once the freebie is used in either manner, I will generally not grant further extensions, except in the case of limited and verifiable emergency situations, or University and DRC-sanctioned accommodations. It is imperative that you communicate with me early on if circumstances permit. Timely submissions are the only way for me to get you timely feedback.
  • In case you have exhausted your freebie, and feel like you will be unable to submit an assignment in time, reach out to me. Depending on your circumstances, I may not give you an extension, but I will certainly offer you the right resources to help you make the best of your assignment. My only goal is to help you succeed.

Academic Integrity

  • Please familiarize yourself with Northeastern University's Academic Integrity Policy.
  • Sharing of code in any form (including posting on Campuswire) is strictly forbidden. Searching for solutions online is okay, with appropriate citations in code comments. You may not ask TAs or the instructor to help debug code that was found online.
  • Any violation of academic integrity (as outlined by homework policies above) will result in an OSCCR report being filed against you.
  • Additional academic penalties, including but not restricted to failing the course without an option to withdraw, may be levied against you at the discretion of the instructor.
  • Recognize that most violations are often easily avoided by simply acknowledging any difficulties you may be having with the course, and seeking help from your instructors in a timely fashion. We're here to help you learn.

Classroom Environment

To create and preserve a classroom atmosphere that optimizes teaching and learning, all participants share a responsibility in creating a civil and non-disruptive forum for the discussion of ideas. Students are expected to conduct themselves at all times in a manner that does not disrupt teaching or learning. Your comments to others should be constructive and free from harassing statements. You are encouraged to disagree with other students and the instructor, but such disagreements need to be respectful and based upon facts and documentation (rather than prejudices and personalities). The instructor reserves the right to interrupt conversations that deviate from these expectations. Repeated unprofessional or disrespectful conduct may result in a lower grade or more severe consequences. Part of the learning process in this course is respectful engagement of ideas with others.

TA Team & Office Hours

Campuswire, the platform we're using for 24/7 Q&A, also has built-in video chatrooms with automatic queue management, which will be used for any listed virtual office hours. TAs may offer a mix of in-person and hybrid office hours; timings and locations will be updated here.

Alberto Mario Ceballos Arroyo
ceballosarroyo.a@northeastern.edu
Office Hours: Mon, Wed 12-2 (Cahners 003); Mon 6-8, Campuswire
Sai Thanishvi Daruru
daruru.s@northeastern.edu
Office Hours: Tue, Fri 1-3, Thu 9-11, Campuswire
Divyadarshini Muruganandham
muruganandham.d@northeastern.edu
Office Hours: Thu 3-5, Fri 10-12, Sat 10-12, Campuswire
Bhargav Naga Shyam Pabbaraju
pabbaraju.b@northeastern.edu
Office Hours: Tue, Thu, Fri, 5-7, Campuswire
Tejodhay Bonam
bonam.t@northeastern.edu
Office Hours: Mon, Wed, Fri, 10-12, Campuswire


Campus Resources

Healthcare, Counseling, and Wellness

Your health and well-being are paramount, above any and all course deliverables. There is a wide range of support services on campus to ensure your success, and I encourage you to reach out to resources as appropriate.

University Health and Counseling Services
Find@Northeastern - 24/7 Mental Health Support
WeCare
Support Groups and Workshops


Title IX

  • Title IX of the Education Amendments of 1972 protects individuals from sex or gender-based discrimination, including discrimination based on gender-identity, in educational programs and activities that receive federal financial assistance. Northeastern’s Title IX Policy prohibits Prohibited Offenses, which are defined as sexual harassment, sexual assault, relationship or domestic violence, and stalking. The Title IX Policy applies to the entire community, including students, faculty and staff of all genders.
  • If you or someone you know has been a survivor of a Prohibited Offense, confidential support and guidance can be found through University Health and Counseling Services staff and the Center for Spiritual Dialogue and Service clergy members. By law, those employees are not required to report allegations of sex or gender-based discrimination to the University.
  • Alleged violations can be reported non-confidentially to the Title IX Coordinator within The Office for University Equity and Compliance by filling out the online Discrimination Complaint Form, emailing the OUEC (less secure) at: titleix@northeastern.edu and/or through NUPD (Emergency 617.373.3333; Non-Emergency 617.373.2121). Reporting Prohibited Offenses to NUPD does NOT commit the victim/affected party to future legal action.
  • Faculty members are considered “responsible employees” at Northeastern University, meaning they are required to report all allegations of sex or gender-based discrimination to the Title IX Coordinator. In case of an emergency, please call 911. Please visit the Title IX webpage for a complete list of reporting options and resources both on-campus and off-campus.


Disability Resources

Students with disabilities who wish to receive academic services and/or accommodations should visit the Disability Resource Center at 20 Dodge Hall, or call 617.373.2675. If you have not already done so, please provide your letter from the DRC to your instructor early in the semester so that they can arrange those accommodations.