- This awesome website indexes 11,762 movies released since 1970.
- By default the interface shows 40 random movies.
- If you click on any movie, it will show you the 40 movies that are most similar to it.
- Similarity is calculated from the text of each movie's Wikipedia Summary + Plot.
- Embeddings can be either tfidf or ada. tfidf uses TF-IDF weights over simple bigrams; ada uses OpenAI embeddings, specifically the text-embedding-ada-002 model. ada should capture more high-level, semantic similarity, while tfidf should be driven more by individual word use.
- The ranker can be either k-Nearest Neighbors (kNN) using cosine similarity, or an Exemplar Support Vector Machine (SVM).
- You can also search for movies by their title with the search box.
- Just a fun weekend hack.
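
For the curious, here is a rough sketch of how the tfidf embedding and the two rankers could fit together. It is not the site's actual code: the function names (`tfidf_vectors`, `knn_rank`, `exemplar_svm_rank`), the toy plot snippets, the balanced class weights, and the values of `C`, `epochs`, and `lr` are all assumptions made for illustration. The SVM is trained with a crude subgradient loop to keep the example dependency-free; a real version would more likely lean on a library solver. The ada variant is not shown, but the same rankers would presumably work on those vectors too.

```python
import math
from collections import Counter

def bigrams(text):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]

def tfidf_vectors(docs):
    """Return one unit-length {bigram: weight} dict per document."""
    counts = [Counter(bigrams(d)) for d in docs]
    df = Counter(term for c in counts for term in c)  # document frequency
    n = len(docs)
    vectors = []
    for c in counts:
        # Smoothed idf; one common variant, chosen here as an assumption.
        vec = {t: tf * (math.log((1 + n) / (1 + df[t])) + 1) for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def dot(a, b):
    """Sparse dot product; iterates over the smaller dict."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def knn_rank(vectors, query):
    """Rank every other item by cosine similarity to the query item."""
    # Vectors are unit length, so the dot product is the cosine similarity.
    scores = {i: dot(vectors[query], v) for i, v in enumerate(vectors) if i != query}
    return sorted(scores, key=scores.get, reverse=True)

def exemplar_svm_rank(vectors, query, C=0.1, epochs=200, lr=0.1):
    """Fit a linear SVM with the query as the only positive, then rank by score."""
    n = len(vectors)
    labels = [1 if i == query else -1 for i in range(n)]
    # Balanced class weights (an assumption) so one positive is not drowned out.
    weight = {1: n / 2.0, -1: n / (2.0 * (n - 1))}
    w, b = {}, 0.0
    for _ in range(epochs):
        grad_w, grad_b = dict(w), 0.0  # gradient of the 0.5 * ||w||^2 term
        for x, y in zip(vectors, labels):
            if y * (dot(w, x) + b) < 1.0:  # hinge loss is active for this item
                for t, v in x.items():
                    grad_w[t] = grad_w.get(t, 0.0) - C * weight[y] * y * v
                grad_b -= C * weight[y] * y
        for t, g in grad_w.items():
            w[t] = w.get(t, 0.0) - lr * g
        b -= lr * grad_b
    scores = {i: dot(w, x) + b for i, x in enumerate(vectors) if i != query}
    return sorted(scores, key=scores.get, reverse=True)

# Made-up plot snippets, standing in for the Wikipedia Summary + Plot text.
docs = [
    "a young wizard attends a school of magic",
    "a young wizard returns to the school of magic for a second year",
    "two detectives hunt a serial killer in a rainy city",
    "an astronaut is stranded alone on mars",
    "a retired hitman seeks revenge for his dog",
]
vectors = tfidf_vectors(docs)
print("kNN top match for movie 0:", knn_rank(vectors, 0)[0])
print("SVM top match for movie 0:", exemplar_svm_rank(vectors, 0)[0])
```

On this toy set both rankers put movie 1 (the other wizard plot) first for movie 0. The difference between them is that kNN only looks at the query and each candidate, while the SVM also sees every other movie as a negative, so it can down-weight bigrams that are common across the whole collection.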