flow:
dataset (short-sentence text descriptions)
-> Preprocessing (Case folding, Tokenizing, Stopword removal)
-> word embedding (BERT)
-> Distance calculation (Cosine Similarity, Jaccard, Euclidean, Manhattan)
-> Precision, Recall, F-score
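
The preprocessing and distance steps in the flow above could be sketched roughly as follows. This is a minimal illustration and not a prescribed implementation: the stopword list is a tiny hypothetical placeholder, the regex tokenizer only handles ASCII letters and digits, and the numeric vectors passed to the distance functions are assumed to be the output of the embedding step, which is not shown here.

```python
import math
import re

# Hypothetical, tiny stopword list for illustration only; a real project
# would substitute a full list for the dataset's language.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "in", "on", "to"}


def preprocess(text):
    """Case folding -> tokenizing -> stopword removal."""
    lowered = text.lower()                      # case folding
    tokens = re.findall(r"[a-z0-9]+", lowered)  # simple ASCII tokenizer (assumption)
    return [t for t in tokens if t not in STOPWORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def jaccard_similarity(tokens_a, tokens_b):
    """Size of the token-set intersection divided by the size of the union."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty inputs
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    a = preprocess("The cat is on the mat.")
    b = preprocess("A cat sat on a mat")
    print(a, b, jaccard_similarity(a, b))
    # Toy vectors standing in for sentence embeddings.
    print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
    print(manhattan_distance([0.0, 0.0], [3.0, 4.0]))
```

Note that Jaccard works on token sets, while the other three measures work on numeric vectors. Cosine and Jaccard are similarities (higher means closer), whereas Euclidean and Manhattan are distances (lower means closer), so any ranking step needs to sort them in opposite directions.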
note:
- everything is based on Deep Learning (not Machine Learning)
- the word embedding must be implemented with the Gensim library
- the FORMULAS (Precision, Recall, F-score) must follow the information retrieval definitions from this source:
https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Precision
- the project uses Python (.ipynb)
- dataset split: 30% testing, 70% training
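
As a rough sketch of the last two notes, the set-based information retrieval definitions (precision = |relevant ∩ retrieved| / |retrieved|, recall = |relevant ∩ retrieved| / |relevant|, F-score = harmonic mean of the two) and a 70/30 split might look like this. The function names, the fixed seed, and the document IDs in the demo are hypothetical choices made for illustration.

```python
import random


def precision_recall_f1(relevant, retrieved):
    """Set-based IR metrics for one query.

    relevant  -- IDs of the items that are truly relevant
    retrieved -- IDs of the items the system returned
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def split_70_30(items, seed=42):
    """Shuffle a copy of the items and cut it into 70% training, 30% testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed (assumption) for repeatability
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical document IDs.
    p, r, f = precision_recall_f1({"d1", "d2", "d3"}, {"d2", "d3", "d4", "d5"})
    print(p, r, f)
    train, test = split_70_30(range(10))
    print(len(train), len(test))
```

In the demo, two of the four retrieved items are relevant and two of the three relevant items are retrieved, so precision is 1/2, recall is 2/3, and the F-score is 4/7.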











