Activate 2018 has ended
Thursday, October 18 • 10:30am - 11:10am
Vectors in Search – Towards More Semantic Matching

With the advent of deep learning and algorithms like word2vec and doc2vec, vector-based representations are increasingly being used in search to represent anything from documents to images and products. However, search engines work with documents made of tokens, not vectors, and are typically not designed for fast vector matching out of the box. In this talk, I will give an overview of how vectors can be derived from documents to produce a semantic representation that can be used to implement semantic / conceptual search without hurting performance. I will then describe a few different techniques for efficiently searching vector-based representations in an inverted index, such as learning sparse representations of vectors, clustering, and learning binary vectors. Finally, I will discuss some of the pitfalls of vector-based search, and how to get the best of both worlds by combining vector-based scoring with traditional relevancy metrics such as BM25.
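
As a rough illustration of the binary-vector idea and of blending vector scores with a lexical score, here is a minimal Python sketch. It is not the speaker's method: it swaps in random-hyperplane sign hashing where the talk mentions learned binary vectors, and every name in it (`make_hyperplanes`, `to_tokens`, `token_overlap`, `blended_score`, the `weight` parameter) is hypothetical. It also assumes the lexical score (for example BM25 from the search engine) has already been normalized to the range 0 to 1.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumption: not learned, just sampled)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def to_tokens(vec, planes):
    """Turn a dense vector into one token per bit, e.g. 'b3_1'.

    Tokens like these can be indexed in an ordinary inverted index,
    so similar vectors share many tokens.
    """
    tokens = []
    for i, plane in enumerate(planes):
        dot = sum(v * p for v, p in zip(vec, plane))
        tokens.append(f"b{i}_{1 if dot >= 0 else 0}")
    return tokens


def token_overlap(query_tokens, doc_tokens):
    """Fraction of query bit-tokens that the document shares."""
    shared = set(query_tokens) & set(doc_tokens)
    return len(shared) / len(query_tokens)


def cosine(a, b):
    """Exact cosine similarity, usable to rescore the top candidates."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def blended_score(lexical, semantic, weight=0.5):
    """Linear blend of a normalized lexical score and a vector score."""
    return (1.0 - weight) * lexical + weight * semantic


planes = make_hyperplanes(dim=4, n_bits=16)
query_vec = [0.9, 0.1, 0.0, 0.3]
doc_vec = [0.8, 0.2, 0.1, 0.4]

# Cheap candidate matching on tokens, then an exact rescore blended
# with a placeholder lexical score of 0.4.
candidate_match = token_overlap(to_tokens(query_vec, planes), to_tokens(doc_vec, planes))
final_score = blended_score(lexical=0.4, semantic=cosine(query_vec, doc_vec))
```

The token overlap acts as a cheap first-pass filter inside the index, while the exact cosine and the blend are applied only to the retrieved candidates. The blend weight is a tuning knob, not a recommended value.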

Speakers
Simon Hughes

Chief Data Scientist, Dice.com
Simon is currently the Chief Data Scientist at Dice.com, the technology professional recruiting site. He is also a PhD candidate at DePaul University, studying machine learning and natural language processing. At Dice, he has developed multiple recommender engines for matching...


Thursday October 18, 2018 10:30am - 11:10am
Drummond East, Level 3