An algorithm that relies on inverse document frequency (IDF) and keyword matching to provide relevant suggestions during reply construction

In search of a method to reduce the time spent responding to emails, Georgia Tech inventors have developed DejaVu, an algorithm that relies on inverse document frequency (IDF) and keyword matching to provide relevant suggestions while a reply is being composed. DejaVu’s system architecture has three components: the DejaVu client, the Information Curator, and the Information Database. The DejaVu client runs on the user’s smartphone and interacts with the Information Curator in the cloud. The Information Curator maintains the Information Database and retrieves suggestions from it.

The Information Curator parses emails in their entirety and stores them in the Information Database. Each entry in the database is indexed by a set of keywords extracted from it. The text of an entry is first converted to lowercase and then split into its constituent words, W(i). To capture the core context of the text in the index and to avoid duplicate index terms for words that are close in meaning, every word in W(i) is trimmed to its root. Each entry is then indexed on the set of roots of the words in W(i). When suggestions are needed for an email, that email is parsed and its core context is extracted as a list of keywords. The Information Curator then matches these keywords against the index of entries in the Information Database. The DejaVu client uses a hybrid push/pull model to retrieve suggestions from the Information Curator.
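The indexing and matching steps described above can be illustrated with a minimal Python sketch. It is not DejaVu's implementation: the class and function names (`InformationDatabase`, `extract_roots`, `crude_stem`, `suggest`) are hypothetical, the suffix-stripping stemmer is a crude stand-in for whatever root-trimming method the inventors use, and the smoothed IDF formula and additive scoring are assumptions about how IDF might weight keyword matches.

```python
import math
import re
from collections import defaultdict


def crude_stem(word):
    """Trim a few common English suffixes (a rough stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_roots(text):
    """Lowercase the text, split it into words W(i), and return the set of roots."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {crude_stem(w) for w in words}


class InformationDatabase:
    """Hypothetical in-memory store that indexes each entry on its word roots."""

    def __init__(self):
        self.entries = []              # entry id -> original text
        self.index = defaultdict(set)  # root -> ids of entries containing it

    def add(self, text):
        entry_id = len(self.entries)
        self.entries.append(text)
        for root in extract_roots(text):
            self.index[root].add(entry_id)
        return entry_id

    def idf(self, root):
        """Smoothed inverse document frequency (one common variant, assumed here)."""
        n = len(self.entries)
        df = len(self.index.get(root, ()))
        return math.log((1 + n) / (1 + df)) + 1.0

    def suggest(self, email_text, top_k=3):
        """Rank entries by the summed IDF of the roots they share with the email."""
        scores = defaultdict(float)
        for root in extract_roots(email_text):
            weight = self.idf(root)
            for entry_id in self.index.get(root, ()):
                scores[entry_id] += weight
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [self.entries[i] for i, _ in ranked[:top_k]]


db = InformationDatabase()
db.add("The quarterly budget meeting is scheduled for Friday at 3pm.")
db.add("Your flight booking to Boston is confirmed.")
db.add("Meeting notes from the design review are attached.")

print(db.suggest("When is the budget meeting scheduled?", top_k=1))
```

In this sketch, roots that appear in few entries (such as "budget") contribute more to an entry's score than roots that appear in many (such as "is"), which is the general intent of IDF weighting.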

Solution Advantages
  • Potential to decrease time spent replying to emails
  • Reply suggestions are relevant because of the robust Information Curator and Information Database in the system architecture
Potential Commercial Applications
  • Any email client (Gmail, Outlook, Apple Mail, etc.): users of these email clients can subscribe to the reply prediction service. It can also be provided to users as a standalone cloud service.
Background and More Information

Despite the emergence of many other means of communication, such as social networking platforms, instant messaging (IM), and mobile IM, email remains the most pervasive form of communication within enterprises. The number of emails sent and received per day within enterprises is expected to grow to 139.4 billion by 2018. The average enterprise employee sent or received 126 emails per day in 2015. As a result of this deluge, the average enterprise worker spends 28% of their work time reading and responding to emails.