We've built a powerful news personalization system focused specifically on professionals. It's like Google News, Twitter, and YouTube all rolled into one and customized just for your career.
We use it to power a subscription business where customers pay us $5 a month for a personalized email digest and newsreader apps. We also create customized company newsfeeds and white-labeled newsletters at a (much) higher price.
Our customers are mainly from the consulting, strategy, and financial services sectors.
Over the past four years, we've built our content recommendation system on Elasticsearch, combined with a proprietary taxonomy and a ranking / scoring system.
There is still a large manual component that we'd like to automate. We've manually tagged over 300k articles and want to explore training the system on this data. We've also recently incorporated IBM Watson concept mapping into the algorithm, and we're looking to further integrate user-level data (clicks) into the system.
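As a rough illustration of what training on manually tagged articles could look like, here is a minimal sketch of a multinomial Naive Bayes tagger in plain Python. It is not our production system: the `NaiveBayesTagger` class, the tag names, and the sample headlines are all hypothetical, and it assumes one tag per article, which is simpler than a real taxonomy.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTagger:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, articles):
        # articles: iterable of (text, tag) pairs, e.g. manually tagged articles
        self.tag_counts = Counter()              # number of articles per tag
        self.word_counts = defaultdict(Counter)  # word frequencies per tag
        self.vocab = set()
        for text, tag in articles:
            self.tag_counts[tag] += 1
            for word in tokenize(text):
                self.word_counts[tag][word] += 1
                self.vocab.add(word)
        self.total_docs = sum(self.tag_counts.values())
        return self

    def predict(self, text):
        # Ignore words never seen in training; they carry no signal here.
        words = [w for w in tokenize(text) if w in self.vocab]
        best_tag, best_score = None, float("-inf")
        for tag, n_docs in self.tag_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(n_docs / self.total_docs)
            total_words = sum(self.word_counts[tag].values())
            denom = total_words + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[tag][w] + 1) / denom)
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Toy stand-in for the tagged corpus (illustrative data only).
TRAINING = [
    ("Startup raises funding to scale cloud software platform", "technology"),
    ("New machine learning software speeds up cloud analytics", "technology"),
    ("Bank reports quarterly earnings and raises dividend", "finance"),
    ("Investors weigh interest rates as bank earnings fall", "finance"),
    ("Consultants advise executives on corporate growth strategy", "strategy"),
    ("Merger reshapes competitive strategy for executives", "strategy"),
]

tagger = NaiveBayesTagger().fit(TRAINING)
predicted = tagger.predict("Cloud software startup launches machine learning tool")
print(predicted)
```

A baseline like this is mainly useful as a reference point: any richer model, or any use of click data, would need to beat it on held-out tagged articles to justify the added complexity.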
This would definitely be more of a "small data" project, but over a number of years we've collected a very focused content dataset specific to strategy / innovation / technology, and we hope it will make for an interesting project.
The ideal candidate is interested in extracting meaning from text (NLP) or solving textual content problems. We're open to junior-level candidates.