Google Research

The CSSLab has received a $20,000 research grant from Google Research for its project “Modeling YouTube Users’ Production & Consumption of News,” an ambitious project to illuminate the supply-demand mechanics of the YouTube news ecosystem. It is part of the CSSLab’s Penn Media Accountability Project (PennMAP), an interdisciplinary, nonpartisan research program dedicated to enhancing media transparency and accountability at the scale of the entire information ecosystem.

Aided by Google Research’s funding, the Lab will enrich the study of the growing YouTube platform by analyzing a corpus of more than 5 million video transcripts and developing state-of-the-art machine learning techniques to quantify the amount and types of news produced and consumed at the video level. This corpus is made up of YouTube API data from every video produced by ~17K YouTube channels, along with the browsing records of over 300,000 YouTube users from a nationally representative US desktop web panel, over the course of six years. Taken together, these data can shed valuable light on the feedback loop between news consumption and production, where increased viewership of a certain type of content drives channels to saturate the news space with similar content. Google Research’s grant will enable the design, construction, and maintenance of the large-scale research infrastructure and databases crucial to this work.

The impact of the Lab’s research extends beyond developing new machine learning techniques and providing a benchmark for content analysis. The CSSLab emphasizes the design, implementation, and promotion of open science practices, including transparent research protocols, pre-registration and replication, data and code sharing, and updating findings. In addition to accelerating the rate of knowledge acquisition, the resulting mass collaboration model will help to broaden participation to researchers who would otherwise lack the resources to participate in cutting-edge CSS and user modeling research.

Solution-driven interdisciplinary connections

The project was presented at Google’s virtual Workshop on Action, Task and User Journey Modeling, held October 5-6, 2022. As part of the workshop, eligible faculty had the opportunity to submit a proposal to be considered for an unrestricted gift to be used towards related research.

The Workshop on Action, Task and User Journey Modeling was created to bring together key researchers from academia and Google to exchange ideas and forge new collaborations. It virtually convened researchers inside and outside Google from multiple fields, such as Natural Language Processing, Recommendation Systems/User Modeling, and Large-scale Graph Mining, as well as experts in topics such as consumer psychology and behavioral economics. These researchers presented state-of-the-art research relevant to the understanding of the actions, tasks, and journeys that people perform online. It was conceptualized as an opportunity to cross-pollinate ideas, to establish new collaborations, and for external researchers to gain insight into Google’s unique challenges and perspectives on these problems, e.g. applying large foundational models to task modeling or privacy-preserving approaches for understanding user journeys at scale.

About Google Research

Google Research maintains a portfolio of research projects driven by fundamental research, new product innovation, product contribution and infrastructure goals, while providing individuals and teams the freedom to emphasize specific types of work. They strive to create an environment conducive to many different types of research across many different time scales and levels of risk.

In recent years, computing has both expanded as a field and grown in its importance to society. Similarly, the research conducted at Google has broadened dramatically, becoming more important than ever to Google’s mission. As such, their research philosophy has become more expansive and now incorporates a substantial amount of open-ended, long-term research driven more by scientific curiosity than current product needs.

Google Research’s goal is to create a research environment rich in opportunities for product impact, to build a product environment that actively benefits from research, and to provide their staff the freedom to work on important research problems that are not tied to immediate product needs. Some of the most exciting research enables new products, or even new businesses, that cannot be imagined today. Given the diversity of research projects that they pursue, Google Research has found it useful to define four types of work to help crystallize the goals of projects and allow them to measure progress: basic research and fundamental applied research, new product innovation, critical product contributions, and infrastructure.

Today, Google Research actively pursues advancement across 23 core research areas. Google Research teams aspire to make discoveries that impact everyone, and core to their approach is sharing their research and tools to fuel progress in the field. Their researchers publish regularly in academic journals, release projects as open source, and apply research to Google products.


Emma Arsekin

Senior Communications Specialist