Using machine learning and data scraping tools, computer scientists at the New York University Tandon School of Engineering released the first database and analysis of political advertising based on more than 884,000 ads identified by Google, Twitter, and Facebook.
The team launched its Online Political Ads Transparency Project in July. The project aims to improve the transparency of online political advertising by building tools to collect and archive political advertising data.
Today's report is the first to include not only data from Facebook (including Instagram) but also data newly shared by Twitter and Google.
Although the team found numerous roadblocks to meaningful transparency - ranging from faulty archives constructed in haste by the social media giants to varying definitions of "political advertising" and throttling of data collection by Facebook - NYU Tandon Computer Science and Engineering Assistant Professor Damon McCoy and his team nonetheless reported meaningful insights. According to the report:
- President Donald Trump and his PAC registered the largest number of ads of any candidate, due in large part to the preponderance of small, micro-targeted advertising. Virtually all were aimed at raising funds during the study period, September 9-22, 2018. The researchers found similar dominance by President Trump in their initial, Facebook-only, analysis.
- The Democratic candidate for Senate from Texas, Beto O'Rourke, continued to be the apparent largest spender, mostly seeking small donations from outside his state via Facebook and Twitter. Although O'Rourke was the rare federal candidate unaffiliated with a PAC, he was like other candidates in using social media to raise funds outside their districts, McCoy noted.
- The Senate Leadership Fund, a Republican Super PAC, was the largest spender on Google and across all three platforms combined.
- Priorities USA, a left-leaning PAC, was among the big spenders, but exact figures are not available because it collaborated on ad placements with other PACs.
- Left-leaning organizations are the big spenders on Facebook and Twitter; on Google, the trend is reversed.
- Facebook apparently carries the most political ads, but Google apparently ranks higher in impressions and spending. This is due, in part, to the large number of small, micro-targeted ads on Facebook (60 percent) and because the majority of spending on Google (61 percent) is by PACs, which are more likely to have large budgets. But analysis is muddied by the fact that both Google and Facebook disclose only ranges; only Twitter discloses exact spending and impressions. Each of the giants also defines "political advertising" differently. For example, Facebook alone includes non-media for-profit companies promoting slanted political content, companies selling merchandise with political messages, and solar panel firms with environmental messages. Google and Twitter, meanwhile, limited their reporting to federal candidates only, at least initially.
- PACs accounted for 23 percent of the spending on Facebook during the study period.
- The very top spenders during the study period on Facebook, though, were Facebook itself and its own Instagram - Facebook to publicize its responses to Russian election hacking and Instagram to spread a get-out-the-vote message. But the researchers pointed out that the company seemed to overcharge itself, based upon impressions.
McCoy conceived the project to build easy-to-use tools to collect, archive, and analyze political advertising data. Although Facebook became the first major social media company to launch a searchable archive of political advertising, for both Facebook and Instagram, in May 2018, McCoy found the archive difficult to use, requiring time-consuming manual searches. He decided to apply versions of the data scraping techniques he had previously used against criminals, including human traffickers who advertised and used Bitcoin.
Despite the difficulty the team subsequently encountered accessing Facebook data, they report it has by far the most comprehensive political archive among the three social media companies. The report outlines problems with the API - an interface with other platforms - introduced in beta form by Facebook to allow researchers access to its archives.
Google's data is the easiest for the public to access, as a BigQuery dataset, available in its entirety via the Google Cloud service. But it is updated in real time, with no archiving, so the NYU researchers are capturing the data daily, to share and archive.
Twitter has no easily accessible political ad archive, so the NYU research team is scraping all political advertising data identified by Twitter, then sharing and archiving it for the public as well.
Although the researchers used the September period for comparison purposes, they have now compiled data from late May through October 3, with a gap of about six weeks while Facebook blocked the team's data scraping.
You can visit the project and download data at: https://online-pol-ads.github.io.