Important Dates

Submission Due

May 14, 2015

Deadline Extended:

May 24, 2015

Author Notification

June 4, 2015

June 11, 2015

Final Submission Due

June 21, 2015

Workshop Dates

July 30-31, 2015

Workshop News

Contact

sujatha.das@gmail.com
ccaragea@ksu.edu


Workshop Abstract

The current-day Web provides access to enormous amounts of textual data. A wealth of news articles, weblogs, customer reviews, forum threads, scientific documents, and social media data is rapidly being made available online. It is estimated that, every minute, users send about 204 million email messages and WordPress users publish about 347 new blog posts. Even when restricted to particular domains and document types, the amount of data currently available is overwhelming. For example, the scientific database PubMed currently has over 24 million documents in its collection, while Google Scholar is estimated to have had 160 million documents as of 2014.

These rapidly growing document collections offer several benefits for discovery, learning, and staying informed. However, data mining applications now face the challenge of efficiently processing more documents in less time. One approach previously adopted to handle this challenge is to use "document summaries" or "key parts of a document" in lieu of entire documents. Document summaries are not directly available, however; they need to be gleaned from the many details in documents. The keyphrases of a document are considered a "micro summary" of it and comprise the descriptive phrases or concepts extracted from the document. Keyphrases are used in a multitude of applications, such as query formulation, document clustering, recommendation, summarization, indexing, search and retrieval, tracking topics in newswire, linking Web documents to Wikipedia articles, recommending academic papers, and online advertising.