- Each student will be randomly assigned two topics, one about NLP and one about Python.
- The main focus will be on the NLP topic, but sufficient knowledge (at least the equivalent of a passing grade) must be demonstrated in both topics to pass the exam.

Due to the current situation, all exams will be held on Microsoft Teams. This means:
- there is no preparation time after you are assigned topics,
- you must have your camera turned on,
- you must be prepared to share your screen and start writing code. We recommend that you already have Jupyter or Colab running when you start the exam.
Date | Time | Location |
---|---|---|
May 25 (Tue) | 10:15 | Teams |
Jun 01 (Tue) | 10:15 | Teams |
Jun 08 (Tue) | 10:15 | Teams |
- NLP tasks
- Tokenization, lemmatization, POS tagging, language modelling
- Text representations
- One hot encoding, TF-IDF
- Feature vectors, classification pipeline, logistic regression
- Problems with one-hot encoding
- Creating word embeddings, cosine similarity of word vectors
- Using word embeddings in NLP tasks
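
As an illustration of the cosine similarity topic above, here is a minimal sketch in plain Python. The three vectors are made-up toy values, not real embeddings, and the function name `cosine_similarity` is chosen only for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 3-dimensional "embeddings"; the numbers are invented for illustration.
king = [0.8, 0.3, 0.1]
queen = [0.7, 0.4, 0.1]
apple = [0.1, 0.2, 0.9]

sim_close = cosine_similarity(king, queen)  # vectors pointing the same way
sim_far = cosine_similarity(king, apple)    # vectors pointing apart
```

With these toy values, `sim_close` comes out larger than `sim_far`, which is the behaviour one would hope for from embeddings of related and unrelated words.
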
- Learning types (supervised, unsupervised, classification, regression, clustering)
- Constructing train, validation and test sets
- Terminology
- Loss function
- Batch size
- Epoch
- Learning rate
- Feed forward neural networks
- Neurons, activation functions, softmax
- Difference between feed forward and recurrent neural networks
- 3 basic types
- Example applications
- What are the sequence elements? Pros and cons.
- Padding and batching
- Attention
- Basic idea, not the exact formulation
- Transformer
- Motivation
- Basic components
- Positional encoding
- Contextualized embeddings
- BERT tokenization
- BERT components
- Finetuning
- Applications
- Why do we use metrics other than accuracy?
- Evaluation of universal dependency trees
- F-score
- ROUGE
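
For the F-score topic above, a small sketch of how the score can be computed from true positive, false positive and false negative counts. The function name `f_score` and the counts in the example are assumptions made for illustration only.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from raw counts; beta=1 gives the usual F1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Invented counts: 8 correct hits, 2 false alarms, 8 misses.
# Precision is 0.8 and recall is 0.5 for these counts.
f1 = f_score(tp=8, fp=2, fn=8)
```

Because F1 is the harmonic mean of precision and recall, it stays low when either of the two is low, which accuracy on an imbalanced dataset does not guarantee.
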
- What is in a dependency tree?
- What is CoNLL-U?
- Learning dependency parsing
- Basic ideas, not step-by-step
- What is Jupyter?
- Cell types, cell magic
- Kernel
- Short history of Python
- Python community, PEP, Pythonista
- args, kwargs
- Default arguments
- Lambda functions
- Generators, yield statement
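
The function topics above can be sketched in a few lines. The names `describe`, `countdown` and `square` are hypothetical and exist only for this example.

```python
def describe(*args, sep=", ", **kwargs):
    """Collect positional arguments in args and keyword arguments in kwargs.

    sep is a keyword-only parameter with a default value.
    """
    positional = sep.join(str(a) for a in args)
    keyword = sep.join(f"{k}={v}" for k, v in sorted(kwargs.items()))
    return positional, keyword


def countdown(n):
    """Generator: each yield hands back one value and pauses the function."""
    while n > 0:
        yield n
        n -= 1


# A lambda is an anonymous single-expression function.
square = lambda x: x * x

result = describe(1, 2, a=3)
remaining = list(countdown(3))
```
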
- Static vs. dynamic typing
- Built-in types (numeric, boolean), operators
- Mutability
- list vs. tuple
- Operators
- Advanced indexing
- Extra: time complexity of basic operations (lookup, insert, remove, append etc.)
- Set type and operations
- Character encodings: Unicode vs. UTF-8, encoding, decoding
- Common string operations
- String formatting (mention at least two kinds)
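
As one possible illustration of the string topics above, the sketch below shows three formatting styles and a UTF-8 round trip. The variable names and sample values are invented for the example.

```python
name, score = "Ada", 0.91234

# Three ways to produce the same formatted string.
with_fstring = f"{name}: {score:.2f}"
with_format = "{}: {:.2f}".format(name, score)
with_percent = "%s: %.2f" % (name, score)

# A str holds Unicode code points; encoding turns it into bytes.
text = "árvíztűrő"
encoded = text.encode("utf-8")     # bytes object
decoded = encoded.decode("utf-8")  # back to str
```

The encoded form is longer than the string here because each accented letter takes two bytes in UTF-8.
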
- Data attributes, methods, class attributes
- Inheritance, `super`
- Duck typing
- Magic methods, operator overloading
- Assignment, shallow copy, deep copy
- Object introspection
- Class decorators, static methods, class methods
- Properties
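
Several of the object-oriented topics above fit into one small class. This is only a sketch; the `Vector` class and its members are hypothetical names chosen for the example.

```python
class Vector:
    dimensions = 2  # class attribute, shared by all instances

    def __init__(self, x, y):
        self.x = x  # data attributes, stored per instance
        self.y = y

    def __add__(self, other):
        """Magic method that overloads the + operator."""
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        """Magic method that overloads the == operator."""
        return (self.x, self.y) == (other.x, other.y)

    @property
    def norm(self):
        """Computed on access, read like an attribute (no parentheses)."""
        return (self.x ** 2 + self.y ** 2) ** 0.5

    @classmethod
    def zero(cls):
        """Alternative constructor; receives the class, not an instance."""
        return cls(0, 0)


total = Vector(1, 2) + Vector(2, 2)
```
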
- Basic list comprehension (you should be able to write one)
- Generator expressions
- Extra: iteration protocol, writing an iterator class
- Set and dict comprehension
- `yield` keyword
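
The comprehension topics above might look like this in practice. The word list is an invented example.

```python
words = ["token", "lemma", "tag", "token"]

lengths = [len(w) for w in words]            # list comprehension
unique = {w for w in words}                  # set comprehension
length_by_word = {w: len(w) for w in words}  # dict comprehension

# A generator expression produces values lazily, without building a list.
total_chars = sum(len(w) for w in words)
```
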
- Motivation
- Basic keywords
- Defining exception classes
- Motivation
- Basic usage
- Defining context managers
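
Defining a context manager, as listed above, can be sketched with a class that implements `__enter__` and `__exit__`. The `Timer` class is a hypothetical example, not something provided by the standard library.

```python
import time


class Timer:
    """Measure how long the body of a with-block takes."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self  # this object is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even if the block raised an exception.
        self.elapsed = time.perf_counter() - self.start
        return False  # do not suppress exceptions


with Timer() as timer:
    total = sum(range(1000))
```
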
- What are decorators?
- `@wraps` decorator
- Decorators with parameters
- Classes as decorators
- map, filter, reduce
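
A short sketch of a decorator with a parameter that uses `functools.wraps`, followed by `map`, `filter` and `reduce`. The names `repeat` and `greet` are hypothetical and serve only as an illustration.

```python
from functools import reduce, wraps


def repeat(times):
    """Decorator factory: the outer function takes the parameter."""
    def decorator(func):
        @wraps(func)  # keep the name and docstring of the wrapped function
        def wrapper(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator


@repeat(times=2)
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}"


numbers = [1, 2, 3, 4]
squares = list(map(lambda n: n * n, numbers))        # transform each item
evens = list(filter(lambda n: n % 2 == 0, numbers))  # keep matching items
product = reduce(lambda a, b: a * b, numbers)        # fold into one value
```

Without `@wraps`, `greet.__name__` would report `wrapper` instead of `greet`.
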
- `ndarray`
- Defining ndarrays (mention at least 3 functions)
- array attributes
- Indexing and advanced indexing
- Operations on arrays
- Mention at least 5 operations
- Extra: broadcasting