
NLP Course, 2026

NLP Logo

The current curriculum covers topics from classical NLP to modern LLM training, scaling, and deployment:

  • Foundations: text preprocessing, tokenization (BPE / WordPiece / Unigram), word representations (TF-IDF, Word2Vec, GloVe)
  • Sequence Models: n-gram & neural LMs, RNN/LSTM, seq2seq, attention, Transformer
  • Pre-trained LMs & LLMs: BERT / GPT / T5, transfer learning, prompting, CLM/MLM pre-training, scaling laws
  • Modern LLM Architecture: RoPE / ALiBi, KV cache, MHA → MQA / GQA / MLA, RMSNorm, SwiGLU
  • Training at Scale: mixed precision, ZeRO / FSDP, 5D parallelism, Mixture of Experts
  • Efficient Inference: quantization (GPTQ, AWQ, INT8/4, FP8), distillation, speculative decoding, PagedAttention
  • Applied LLMs: Information Retrieval, RAG, AI agents (ReAct, tool use, memory, MCP)
  • Post-training: alignment, RLHF, DPO
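As a taste of the very first topic in the list, here is a minimal, illustrative sketch of byte-pair-encoding (BPE) merge learning on a toy corpus. The corpus, function name, and word-end marker `</w>` are our own choices for illustration, not part of the course materials:

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merge rules from a tiny corpus (illustrative sketch only)."""
    # Represent each word as a tuple of characters plus a word-end marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count frequencies of all adjacent symbol pairs.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair
        merges.append(best)
        merged = best[0] + best[1]
        # Apply the merge everywhere it occurs.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = bpe_merges(["low", "low", "lower", "newest", "newest", "widest"], 5)
```

Real tokenizers (e.g. the ones behind WordPiece or Unigram models) differ in their scoring and vocabulary handling; this only shows the core greedy-merge loop covered in Week 1.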

Course Staff

Materials

| Week # | Date | Topic | Lecture | Seminar | Additional | Recording |
|---|---|---|---|---|---|---|
| 1 | February 10 | Intro to NLP & Tokenization | slides | ipynb | materials | TBA |
| 2 | February 17 | Feature Extraction and Word Representations | slides | ipynb | materials | TBA |
| 3 | February 24 | Language Modeling, Seq2Seq, Attention | slides | - | materials | TBA |
| 4 | March 3 | Transfer Learning, BERT-like, LLMs | slides | ipynb | materials | YouTube |
| 5 | March 10 | Large Language Models Pre-Training | slides | - | materials | YouTube |
| 6 | March 17 | Modern LLMs evolution beyond the Transformer | slides | ipynb | materials | YouTube |
| 7 | March 31 | Training Large Language Models | slides | - | materials | TBA |
| 8 | April 7 | 5D Parallelism, Mixture of Experts | slides | - | materials | TBA |
| 9 | April 14 | Efficient Inference Techniques and Methods | slides | - | materials | TBA |
| 10 | April 28 | Information Retrieval & RAG | slides | - | materials | TBA |
| 11 | May 5 | AI Agents | slides | ipynb | materials | TBA |

Homeworks

| Task # | Release | Deadline | Inside | Materials |
|---|---|---|---|---|
| 1 | March 10 | March 19, 23:59 | Explore NLP pipeline | ipynb |
| 2 | April 19 | May 3, 23:59 | Models Fine-Tuning | ipynb |
| 3 | May 9 | May 17, 23:59 | AI agent systems | ipynb |

Game Rules

Final mark = 0.3 × (oral answer grade) + 0.7 × (average score for practical assignments)
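The grading rule above can be computed as follows; the function name and the example grades are made up for illustration:

```python
def final_mark(oral_grade, homework_scores):
    """Final mark = 0.3 * oral answer grade + 0.7 * average practical score."""
    hw_average = sum(homework_scores) / len(homework_scores)
    return 0.3 * oral_grade + 0.7 * hw_average

# Hypothetical example: oral grade 8/10, homework scores 9, 7 and 10 out of 10.
mark = final_mark(8, [9, 7, 10])
```

Note that this weighted sum applies only once both blocking parts are passed (see below).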

Both the oral exam and the homeworks are blocking parts: you must pass both to pass the course.

Prerequisites

  • Probability Theory + Statistics
  • Machine Learning
  • Python (see the Python guide)
  • Basic knowledge of NLP

We expect students to know the basics of Natural Language Processing, as the course focuses on more advanced topics. If you are unsure about the basics, we recommend these lectures / materials:

  1. Course from Lena Voita
  2. Speech and Language Processing by Jurafsky and Martin
  3. Stanford CS 224n
  4. Great blog on Transformer & BERT

About

Intelligent Systems course on advanced Natural Language Processing
