ACM CIKM 2013 includes an Industry Track that runs in parallel with the research tracks. The industry track includes a series of invited talks given by influential technical leaders, who will present the state of the art in industrial research and development in information retrieval, knowledge management, databases, and data mining.
Andrew Ng is a Co-founder of Coursera, and a Computer Science faculty member at Stanford.
In 2011, he led the development of Stanford University's main MOOC (Massive Open Online Course) platform, and also taught an online Machine Learning class that was offered to over 100,000 students, leading to the founding of Coursera. Ng's goal is to give everyone in the world access to a high-quality education, for free.
Today, Coursera partners with top universities to offer high quality, free online courses. With over 80 partners, nearly 400 courses, and 4 million students, Coursera is currently the largest MOOC platform in the world. Outside online education, Ng's research work is in machine learning; he is also the Director of the Stanford Artificial Intelligence Lab.
Talk: The Online Revolution: Education for Everyone (Slides)
In 2011, Stanford University offered three online courses, which anyone in the world could enroll in and take for free. Together, these three courses had enrollments of around 350,000 students, making this one of the largest experiments in online education ever performed. Since the beginning of 2012, we have transitioned this effort into a new venture, Coursera, a social entrepreneurship company whose mission is to make high-quality education accessible to everyone by allowing the best universities to offer courses to everyone around the world, for free. Coursera classes provide a real course experience to students, including video content, interactive exercises with meaningful feedback based on both auto-grading and peer-grading, and rich peer-to-peer interaction around the course materials. Currently, Coursera has over 80 university and other partners, and 4 million students enrolled in its nearly 400 courses. These courses span a range of topics including computer science, business, medicine, science, humanities, social sciences, and more.
In this talk, I'll report on this far-reaching experiment in education, and explain why we believe this model can provide both an improved classroom experience for our on-campus students, via a flipped classroom model, and a meaningful learning experience for the millions of students around the world who would otherwise never have access to education of this quality.
Chris Farmer is Founder and CEO of SignalFire, an early-stage venture capital and competitive intelligence company that is developing a cutting-edge data analytics platform focused on the technology industry. Leveraging this platform, SignalFire is working with founders and corporate partners to build the next generation of great technology companies and to establish competitive advantage in the market.
Previously, he was a Venture Partner with General Catalyst Partners, where he led the seed investment program, which included investments such as Alation Data, ClassDojo, Dr Chrono, GameClosure, Getaround, Ostrovok, ParElastic, Parlay Labs, Stripe, Thumb, Verticloud, Viajanet, and many more. Chris was previously a Vice President with Bessemer Venture Partners, where he led investments in digital media and mobile companies including Millennial Media (IPO), Enforta Wireless, and Goal.com (Acq), in addition to individual angel investments in LinkedIn (IPO), Endeca (Acq), Lifelock (IPO), Yelp (IPO), Broadsoft (IPO), and others. Prior to BVP, he was a consultant with Bain & Co., where he advised on large-scale technology buyouts.
As an entrepreneur, Chris spearheaded the successful turnaround of SkyBitz, a wireless-enabled SaaS company, which was acquired by Telular Corporation. At SkyBitz, Chris led Product Management and was responsible for finance, recruiting the executive team, and market and business development. He designed products and services that drove over $35M in annual revenue and over $5M in profit. Earlier in his career, Chris founded an investment advisory firm for angel investors and private equity funds and served as a Venture Advisor to an early-stage venture capital fund, where he led investments in SkyBitz and Applied Semantics (acquired by Google to form the foundation for AdWords and AdSense). He started his career on Wall Street in the private equity group of Cowen & Company.
Chris received a BA in International Relations and Business from Tufts University and the Fletcher School of Diplomacy and received High Honors for his thesis on Internet Marketplaces. He has been published in Harvard Business Review, The Wall Street Journal, Institutional Investor, The Journal of Private Equity and other leading publications.
Talk: Leveraging Data to Change Industry Paradigms (Slides)
Much of the conversation on "big data" is centered on data technologies and analytics platforms and how established companies apply them. While those technologies and platforms are certainly very important for industry incumbents, data analytics is also often a key building block for new start-up entrants looking to disrupt industry verticals. In many cases, the best examples of novel applications of data to create new services and competitive advantage require a complete rethinking of organizational design in order to create feedback loops and rethink cost structures.
The company I founded, SignalFire, is applying data for competitive advantage in my own industry, venture capital, but there are myriad examples of this trend across industries such as transportation, financial services, retail, media, and many other markets. In this talk, I will discuss how we analyze these trends as venture capitalists and will look at a few case studies of specific companies leveraging data to innovate in their industries.
Deepak Agarwal is currently Director of Engineering at LinkedIn where he leads a group called Applied Relevance Science (ARS) whose focus is to bridge the gap between science and products by deploying cutting-edge methods to power various recommendation systems at LinkedIn. ARS members are working on computational advertising, content optimization, stream relevance, experimental design and building distributed computing methods for large scale machine learning. Previously, he was a Principal Research Scientist at Yahoo! Research where he worked on content optimization for Yahoo! media properties and computational advertising for Yahoo! Premium display and RightMedia exchange. He developed methods that were deployed in production and led to a 300% improvement in CTR on the Yahoo! Front Page. He was awarded the Yahoo! superstar award for this effort.
Deepak has published extensively in top-tier conferences like KDD, WWW, WSDM, NIPS, ICDM, ICML and others. He was program co-chair for KDD 2012 and regularly serves on various program committees. He is currently an associate editor for two flagship journals in Statistics -- Journal of the American Statistical Association and Annals of Applied Statistics.
Talk: Computational Advertising: The LinkedIn Way (Slides)
LinkedIn is the largest professional social network in the world, with more than 238M members. It provides a platform for advertisers to reach professionals and target them using rich profile and behavioral data, making online advertising an important business for LinkedIn. In this talk, I will give an overview of the machine learning and optimization components that power LinkedIn's self-serve display advertising systems. The talk will focus not only on machine learning and optimization methods, but also on the various practical challenges that arise when running such components in a real production environment. I will describe how we overcome some of these challenges to bridge the gap between theory and practice.
The major components that will be described in detail include:
Ad Selection:
The goal of this component is to estimate the click-through rate (CTR) when an ad is shown to a user in a given context. Given the data sparseness caused by the low CTRs typical of advertising applications, together with the curse of dimensionality, estimating such interactions is known to be challenging. Furthermore, since the goal of the system is to maximize expected revenue, this is an explore/exploit problem and not a pure supervised learning problem.
Our approach takes recourse to supervised learning to reduce dimensionality and couples it with classical explore/exploit schemes to balance the explore/exploit tradeoff. In particular, we use a large-scale logistic regression to estimate user and ad interactions. Such interactions comprise two additive terms: a) stable interactions, captured by features of both users and ads whose coefficients change slowly over time, and b) ephemeral interactions, which capture ad-specific residual idiosyncrasies missed by the stable component. Exploration is introduced via Thompson sampling on the ephemeral interactions (sampling coefficients from the posterior distribution), since the stable part is estimated from large amounts of data and is subject to very little statistical variance. Our model-training pipeline estimates the stable part using a scatter-and-gather approach via the ADMM algorithm; the ephemeral part is estimated more frequently by learning a per-ad correction through an ad-specific logistic regression. Scoring thousands of ads at runtime under tight latency constraints is a formidable challenge with such models; the talk will describe methods to scale these computations at runtime.
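As a rough illustration of the idea (a minimal sketch, not LinkedIn's production system: the feature vectors, the Gaussian posterior approximation for the ephemeral coefficients, and all parameter values below are assumptions), Thompson sampling over the ephemeral part of a logistic CTR model might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_ctr(x, w_stable, mu_eph, sigma_eph):
    """Score one ad: stable coefficients are treated as fixed point
    estimates, while ephemeral (per-ad) coefficients are drawn from an
    approximate Gaussian posterior -- one Thompson sample per request."""
    w_eph = rng.normal(mu_eph, sigma_eph)      # posterior draw
    logit = x @ w_stable + x @ w_eph
    return 1.0 / (1.0 + np.exp(-logit))

def select_ad(x, ads):
    """Pick the ad with the highest sampled CTR (in practice the score
    would be multiplied by the bid to maximize expected revenue).
    `ads` maps ad_id -> (w_stable, mu_eph, sigma_eph)."""
    scores = {ad: predict_ctr(x, *params) for ad, params in ads.items()}
    return max(scores, key=scores.get)

# toy example: 3 features, 2 hypothetical ads
x = np.array([1.0, 0.5, -0.2])
ads = {
    "ad_a": (np.array([0.3, 0.1, 0.0]), np.zeros(3), np.full(3, 0.5)),
    "ad_b": (np.array([0.2, 0.2, 0.1]), np.zeros(3), np.full(3, 0.5)),
}
chosen = select_ad(x, ads)
```

Because the posterior draw injects randomness only into the ephemeral term, ads with little data get explored while well-estimated stable interactions stay deterministic.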
Automatic Format Selection:
The presentation of ads in a given slot on a page has a significant impact on how users interact with them. Web designers are adept at creating good formats to facilitate ad display, but selecting the best among those automatically is a machine learning task. I will describe the machine learning approach we use to solve this problem. It is again an explore/exploit problem, but its dimensionality is much lower than that of the ad selection problem.
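Since there are only a handful of candidate formats, this lower-dimensional explore/exploit problem can be sketched as a simple Beta-Bernoulli bandit (purely illustrative: the format names and the choice of a Beta-Bernoulli model are assumptions, not details from the talk):

```python
import random

class FormatBandit:
    """Thompson sampling over a small set of ad formats, with a
    Beta(1, 1) prior on each format's click-through rate."""
    def __init__(self, formats):
        self.stats = {f: [1, 1] for f in formats}  # [alpha, beta]

    def choose(self):
        # sample a CTR for each format from its Beta posterior
        draws = {f: random.betavariate(a, b)
                 for f, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, fmt, clicked):
        if clicked:
            self.stats[fmt][0] += 1
        else:
            self.stats[fmt][1] += 1

# simulate impressions where "square" has the highest true CTR
random.seed(1)
bandit = FormatBandit(["banner", "square", "skyscraper"])
true_ctr = {"banner": 0.02, "square": 0.08, "skyscraper": 0.03}
for _ in range(5000):
    f = bandit.choose()
    bandit.update(f, random.random() < true_ctr[f])
best = max(bandit.stats,
           key=lambda f: bandit.stats[f][0] / sum(bandit.stats[f]))
```

Over time the bandit concentrates traffic on the format with the highest posterior CTR while still occasionally probing the others.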
I will also provide a detailed description of how we deal with issues like budget pacing, bid forecasting, supply forecasting, and targeting. Throughout, the ML components will be illustrated with real examples from production, and evaluation metrics will be reported from live tests. Offline metrics that can be useful in evaluating methods before launching them on live traffic will also be discussed.
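Budget pacing, for instance, is commonly implemented as probabilistic throttling against a uniform spend schedule. The sketch below is a generic illustration of that idea, not LinkedIn's actual pacing logic:

```python
def serve_probability(spent, budget, elapsed_frac):
    """Probabilistic throttling for budget pacing: compare actual spend
    to the spend expected if the budget were consumed uniformly over the
    day, and back off when the campaign is ahead of schedule."""
    if spent >= budget:
        return 0.0                        # budget exhausted
    target = budget * elapsed_frac        # ideal spend so far
    if spent <= target:
        return 1.0                        # on or behind schedule: serve
    # ahead of schedule: reduce serving probability proportionally
    return max(0.0, 1.0 - (spent - target) / max(target, 1e-9))

# campaign halfway through the day, already 75% spent -> throttle
p = serve_probability(spent=75.0, budget=100.0, elapsed_frac=0.5)
```

Each auction request is then entered with probability `p`, spreading the budget across the day rather than exhausting it in the first hours.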
Hugh Williams is the Vice President of Experience, Search, and Platforms at eBay Inc.
His team builds many of the user-facing components of eBay, including the home page, search experiences, motors, fashion, and MyeBay. He is also responsible for several of eBay's platforms, including the search engine, merchandising systems, product catalog and classification systems, geographic expansion, and data technologies.
Hugh was previously a development manager at Microsoft on the Bing search team, and prior to that an Associate Professor at RMIT University in Australia.
Hugh wrote the best-selling "Web Database Applications with PHP and MySQL" for O'Reilly Media Inc. He has a PhD in Computer Science from RMIT University, and has published more than 100 works.
His interests include information retrieval, computational biology, and baseball.
Talk: Challenges in Commerce Search (Slides)
Commerce search engines allow users to discover products, learn about them, and, importantly, make purchases. Commerce search is a challenging problem --- one that is very different from conventional text and web search. In this talk, we discuss what makes commerce search hard, how eBay has solved some of these problems, and what challenges eBay faces in the next generation of its search technologies. We also discuss the recent release of eBay's Cassini engine, share facts and figures about its scale, and outline the progress eBay has made in ranking and relevance for commerce search.
Jeff Hawkins is an engineer, serial entrepreneur, scientist, inventor, and author. He was a founder of two mobile computing companies, Palm and Handspring, and was the architect of many computing products such as the PalmPilot and Treo smartphone.
Throughout his life Jeff has also had a deep interest in neuroscience and theories of the neocortex. In 2002 he founded the Redwood Neuroscience Institute, a scientific institute focused on understanding how the neocortex processes information. The institute is now located at U.C. Berkeley.
In 2004 he wrote the book On Intelligence, which describes progress on understanding the neocortex. In 2005 he co-founded Numenta, Inc. a startup company building a technology based on neocortical theory.
In 2013, the commercial product endeavors of Numenta were rebranded as Grok, and Numenta.org was created as the home and community for the NuPIC (Numenta Platform for Intelligent Computing) open source project. Grok is building solutions that help companies automatically and intelligently act on machine generated data. It is Jeff's hope that Grok and Numenta.org will play a catalytic role in the emerging field of machine intelligence.
Jeff Hawkins earned his B.S. in electrical engineering from Cornell University in 1979. He was elected to the National Academy of Engineering in 2003.
Talk: Online Learning from Streaming Data (Slides)
High velocity machine-generated data is growing rapidly. To act on this data in real time requires models that learn continuously and discover the temporal patterns in noisy data streams.
The brain is also an online learning system that builds models from streaming data. In this talk I will describe recent advances in brain theory and how we have applied those advances to machine-generated streaming data.
At the heart of our work are new insights into how layers of cells in the neocortex infer and make predictions from fast changing sensory data. This theory, called the Cortical Learning Algorithm, has been tested extensively. We have embedded these learning algorithms into a product called Grok which is being applied to numerous problems such as energy load forecasting and anomaly detection. I will give an introduction to the Cortical Learning Algorithm including how it uses sparse distributed representations and then show how Grok makes predictions and detects anomalies in streaming data.
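To make the notion of sparse distributed representations concrete, here is a toy sketch (the vector size, sparsity, and match threshold are illustrative assumptions, not the Grok implementation):

```python
def sdr_overlap(a, b):
    """Similarity of two sparse distributed representations (SDRs),
    each given as a set of active bit indices: the number of shared
    active bits."""
    return len(a & b)

def match(a, b, threshold):
    """Two SDRs 'match' if enough active bits coincide. Because active
    bits are a tiny fraction of a large bit space, even a modest overlap
    is extremely unlikely to occur by chance."""
    return sdr_overlap(a, b) >= threshold

# two hypothetical 2048-bit SDRs with 40 active bits each (~2% sparsity)
sdr1 = set(range(0, 40))
sdr2 = set(range(20, 60))
overlap = sdr_overlap(sdr1, sdr2)  # 20 shared active bits
```

This overlap-based matching is what makes such representations robust to noise: flipping a few bits barely changes the overlap score.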
The Cortical Learning Algorithm is now an open source project (www.numenta.org) and I will give a brief introduction to the project.
Kai Yu is a Director of Engineering at Baidu, where he leads the Institute of Deep Learning (IDL), the company's artificial intelligence powerhouse. Prior to his current position, he was a Senior Research Scientist at Siemens and a Department Head at NEC Laboratories America. In 2011 he was also Visiting Faculty in the Computer Science Department at Stanford University, teaching the class "CS121: Introduction to Artificial Intelligence".
He has published over 70 papers, with an H-index of 32. Kai has served as an Area Chair at top-tier AI conferences, including ICML and NIPS. Together with his co-authors, he received the Best Paper Runner-up Awards at PKDD 2005 and ICML 2013. His NEC team won several prestigious technology competitions, including PASCAL VOC 2009 and ImageNet 2010. Since April 2012, Kai has led a team of engineers and scientists at Baidu to push the boundaries of speech recognition, image search, deep learning, online advertising, and music search. His team has made significant contributions to a wide range of core products, from mobile search to advertising, and one of his groups received Baidu's highest recognition in 2013. He received his BS and MS degrees in Electrical Engineering from Nanjing University, China, and a PhD in Computer Science from the University of Munich, Germany.
Talk: Large-scale Deep Learning at Baidu (Slides)
In the past 30 years, tremendous progress has been made in building effective shallow classification models. Despite this success, we have come to realize that, for many applications, the key bottleneck is not the quality of the classifiers but the quality of the features. The inability to automatically obtain useful features has become the main limitation of shallow models. Since 2006, learning high-level features from raw data using deep architectures has become a major new learning paradigm. In the past two years, deep learning has achieved many performance breakthroughs, for example in image understanding and speech recognition. In this talk, I will walk through some of the latest advances in deep learning at Baidu and discuss the main challenges, e.g., developing effective models for various applications and scaling up model training across many GPUs. At the end of the talk, I will discuss interesting future directions.
Kevin Murphy is a research scientist at Google in Mountain View, California, where he works on information extraction and probabilistic knowledge bases.
Before joining Google in 2011, he was an associate professor of computer science and statistics at the University of British Columbia in Vancouver, Canada. Before starting at UBC in 2004, he was a postdoc at MIT. Kevin got his BA from U. Cambridge, his MEng from U. Pennsylvania, and his PhD from UC Berkeley.
He has published over 50 papers in refereed conferences and journals on machine learning and graphical models, as well as an 1100-page textbook called "Machine Learning: A Probabilistic Perspective" (MIT Press, 2012), which is currently the best-selling machine learning book on Amazon.com. Kevin is also co-Editor-in-Chief of the Journal of Machine Learning Research.
Talk: From Big Data to Big Knowledge (Slides)
We are drowning in big data, but a lot of it is hard to interpret. For example, Google indexes about 40B webpages, but these are just represented as bags of words, which don't mean much to a computer. To get from "strings to things", Google introduced the Knowledge Graph (KG), which is a database of facts about entities (people, places, movies, etc.) and their relations (nationality, geo-containment, actor roles, etc). KG is based on Freebase, but supplements it with various other structured data sources. Although KG is very large (about 500M nodes/entities, and 30B edges/relations), it is still very incomplete. For example, 94% of the people are missing their place of birth, and 78% have no known nationality - these are examples of missing links in the graph. In addition, we are missing many nodes (corresponding to new entities), as well as new types of nodes and edges (corresponding to extensions to the schema).
In this talk, I will survey some of the efforts we are engaged in to try to "grow" KG automatically using machine learning methods. In particular, I will summarize our work on the problems of entity linkage, relation extraction, and link prediction, using data extracted from natural language text as well as tabular data found on the web.
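As an illustration of the link-prediction problem, one common approach (not necessarily the method used at Google; the entities, relations, dimensions, and untrained random parameters below are all made up for the sketch) scores candidate triples with a bilinear model over learned embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy knowledge graph: (subject, relation, object) triples
triples = [
    ("obama", "born_in", "honolulu"),
    ("honolulu", "located_in", "hawaii"),
]

entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
relations = sorted({t[1] for t in triples})
dim = 8
E = {e: rng.normal(size=dim) for e in entities}          # entity vectors
R = {r: rng.normal(size=(dim, dim)) for r in relations}  # relation matrices

def score(s, r, o):
    """Bilinear plausibility score for a candidate fact (s, r, o).
    In a real system E and R would be trained so that known triples
    score higher than corrupted ones; here they are random."""
    return float(E[s] @ R[r] @ E[o])

# rank candidate fillers for a missing 'born_in' fact
candidates = [("obama", "born_in", o) for o in entities if o != "obama"]
ranked = sorted(candidates, key=lambda t: score(*t), reverse=True)
```

Missing links (such as the unknown birthplaces mentioned above) are then predicted by taking the highest-scoring candidate objects for a given subject and relation.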
Rich Caruana is a Senior Researcher at Microsoft Research.
Before joining Microsoft, Rich was on the faculty at the CS Department at Cornell University, at UCLA's Medical School, and at CMU's Center for Learning and Discovery (CALD).
Rich's Ph.D. is from Carnegie Mellon University where he worked with Tom Mitchell and Herb Simon. His thesis on Multi-Task Learning helped generate interest in a new subfield of machine learning called Transfer Learning.
Rich received an NSF CAREER Award in 2004 (for Meta Clustering), best paper awards in 2005 (with Alex Niculescu-Mizil) and 2007 (with Daria Sorokina), co-chaired KDD in 2007 (with Xindong Wu), and serves as area chair for NIPS, ICML, and KDD.
His current research focus is on learning for medical decision making, web ranking, adaptive clustering, and computational ecology.
Talk: Clustering: Probably Approximately Useless? (Slides)
Clustering never seems to live up to the hype. To paraphrase the popular saying, clustering looks good in theory, yet often fails to deliver in practice. Why? You would think that something so simple and elegant as finding groups of similar items in data would be incredibly useful. Yet often it isn't. The problem is that clustering rarely finds the groups you want, or expected, or that are most useful for the task at hand. There are so many good ways to cluster a dataset that the odds of coming up with the clustering that is best for what you are doing now are small. How do we fix this and make clustering more useful in practice? How do we make clustering do what you want, while still giving it the freedom to "do its own thing" and surprise us?
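The point can be demonstrated directly: running the same clustering algorithm on the same data with different random initializations often yields substantially different partitions. A small sketch (plain k-means on a synthetic point cloud; the data and the pairwise-agreement measure are illustrative choices):

```python
import numpy as np

def kmeans(X, k, seed, iters=50):
    """Minimal k-means: random initial centers, then alternate
    assignment and center updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def agreement(a, b):
    """Fraction of point pairs on which two clusterings agree about
    'same cluster' vs 'different cluster' (a simple Rand index)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    return (same_a == same_b).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # a loose cloud: no one "right" grouping
runs = [kmeans(X, k=4, seed=s) for s in range(5)]
scores = [agreement(runs[0], runs[i]) for i in range(1, 5)]
```

When the data lacks clear structure, the agreement scores between runs are typically well below 1: each seed finds a different, equally "valid" partition, which is exactly the problem the talk describes.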
Ted Dunning is the Chief Application Architect at MapR Technologies.
Ted has held Chief Scientist positions at Veoh Networks, ID Analytics, and MusicMatch (now Yahoo Music). He built the most advanced identity theft detection system on the planet, as well as one of the largest peer-assisted video distribution systems and ground-breaking music and video recommendation systems.
Ted has 15 issued and 15 pending patents and contributes to several Apache open source projects, including Hadoop, ZooKeeper, and HBase. He is also a committer for Apache Mahout.
Ted earned a BS degree in electrical engineering from the University of Colorado, an MS degree in computer science from New Mexico State University, and a Ph.D. in computing science from Sheffield University in the United Kingdom. Ted also bought the drinks at one of the very first Hadoop User Group meetings.
Talk: Which Learning Algorithms Really Matter (Industrially)? (Slides)
The set of algorithms that matter theoretically is different from the set that matters commercially. Commercial importance often hinges on ease of deployment, robustness against perverse data, and conceptual simplicity; often, even accuracy can be sacrificed in favor of these other goals. Commercial systems also often live in a highly interacting environment, so off-line evaluations may have only limited applicability. I will describe several commercially important algorithms, such as Thompson sampling (aka Bayesian Bandits), result dithering, on-line clustering, and distribution sketches, and will explain what makes these algorithms important in industrial settings.
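Result dithering is a good example of the kind of conceptually simple algorithm the talk highlights: it can be as little as re-ranking results by log-rank plus Gaussian noise (a minimal sketch of the idea; the noise scale is an illustrative choice):

```python
import math
import random

def dither(results, epsilon=0.5, seed=None):
    """Result dithering: re-rank a result list by log(rank) plus
    Gaussian noise. Top results mostly stay near the top, but each
    impression shows a slightly different order, so the system gathers
    feedback about lower-ranked items instead of freezing the ranking."""
    rng = random.Random(seed)
    scored = [(math.log(rank + 1) + rng.gauss(0, epsilon), item)
              for rank, item in enumerate(results)]
    return [item for _, item in sorted(scored)]

original = ["a", "b", "c", "d", "e", "f"]
shuffled = dither(original, epsilon=0.5, seed=42)
```

The logarithm means adjacent top positions are easy to swap while a last-place item rarely jumps to first, a deliberately crude but robust form of exploration.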
Xavier Amatriain is Director of Personalization Science and Engineering at Netflix, where he leads work on the next generation of recommendation algorithms. He works at the crossroads of machine learning research, large-scale software engineering, and product innovation. Prior to this, he was a Research Scientist and Professor focused on Recommender Systems and neighboring areas such as Data Mining, Machine Learning, and Multimedia. He has authored more than 50 papers in books, journals, and international conferences, holds several patents, and has served on many program committees. Among other roles, he was general co-chair of the 2010 ACM Recommender Systems Conference.
Talk: Beyond Data: From User Information to Business Value Through Personalized Recommendations and Consumer Science (Slides)
Since the $1 million Netflix Prize, announced in 2006, Netflix has been known for having personalization at the core of our product. Our current offering is focused on instant video streaming, and our data is now many orders of magnitude larger. Not only do we have many more users in many more countries, but we also receive many more streams of data: besides ratings, we now also use information such as what our members play, browse, or search for.
In this invited talk I will discuss the different approaches we follow to deal with these large streams of user data in order to extract information for personalizing our service. I will describe some of the machine learning models used, and their application in the service. I will also describe our data-driven approach to innovation that combines rapid offline explorations as well as online A/B testing. This approach enables us to convert user information into real and measurable business value.
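The statistical core of such an online A/B test can be sketched with a standard two-proportion z-test (a generic textbook illustration with made-up numbers, not Netflix's actual experimentation infrastructure):

```python
import math

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for an A/B experiment: returns the z-score
    for the difference in conversion rates between variants A and B,
    using the pooled-proportion standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# hypothetical experiment: 5% vs 6% conversion on 10,000 users each
z = ab_test(conv_a=500, n_a=10000, conv_b=600, n_b=10000)
significant = abs(z) > 1.96   # two-sided test at ~95% confidence
```

Offline exploration narrows the candidate algorithms; a test like this on live traffic then decides whether the measured lift is real before the change ships.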
Many people helped nominate the excellent speakers for the CIKM Industry Track. We appreciate their help and contributions.