https://facebook.tracking.exposed

Torino Hacknight, Toolbox Coworking, 23/02/2017

Gilberto Conti @_bicno

FEE6694981538A34

Did you hear that Facebook is messing with the timeline?

-- someone, in the U-Bahn

via Reuters (2016-06-25)

via The Guardian (2014-06-29) + paper.

via Technology Review

(2016-06-15)

a Medium post by Zeynep Tufekci

(2014-08-14)

via Social Media Week

(2017-01-23)

via Zero Hedge

(2017-01-16)

In spite of appearances, I have a lot of respect for Facebook as a company

This project is intended to help society, policy makers, journalists and users get a better understanding of power dynamics in the age of algorithms

Facebook has roughly 1 billion 700 million users (maybe) worldwide. It will remain a reference point for many years, and its influence can only increase

Society was probably not ready for this?

Collective behaviour

  • It is driven by social policies
    • Tools that give the collectivity self-governance and self-awareness
    • Intended to discourage harmful behaviour and encourage useful behaviour

    Social policies have to be openly discussed

Personalisation algorithms are a tool of social policy!

problems with that?

  • The collectivity cannot hold anyone accountable for the social outcome
  • It is simply powerless
  • Currently, private companies driven by business interests determine what is better and what is not

“These algorithms, when they are not transparent, can lead to a distortion of our perception, they narrow our breadth of information.”

"distortion of perception", aka

  • Filter bubble
  • Echo chambers
  • Algorithmic discrimination
  • (Technocratic scapegoat ?)

The problems:

  • Am I in the bubble?
  • How big is the bubble?
  • Are there other bubbles?
  • How does the bubble influence me?

Let's start from a post

In order to see the effects of a personalisation algorithm:

After September, a small team formed around the project, and we took steps to reach a beta phase.

the project

Our vision is to increase transparency behind personalization algorithms, so that people can have more effective control of their online Facebook experience and more awareness of the information to which they are exposed.

How does it work?

  • Facebook users who join our project are called Supporters. It is an explicit opt-in: you need to install a browser extension to be part of it.
  • The extension monitors and scrapes the Supporter's feed. When it finds a publicly shared post, it collects that HTML section and sends it to a server.
  • All other content, for example posts shared with friends only, is ignored.
We don't want to risk leaking activities intended for a restricted audience, because we might have no control over the dataset.
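
The collection rule can be sketched as a simple filter. This is only an illustration in Python, not the extension's code (the real extension runs in the browser and reads the page HTML). The `visibility` and `html` fields and the `select_public_posts` name are assumptions made for this example.

```python
from typing import Dict, List


def select_public_posts(posts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only posts marked as public; everything else is dropped."""
    payload = []
    for post in posts:
        # Hypothetical field: the real extension has to infer the audience
        # of a post from the page markup.
        if post.get("visibility") == "public":
            payload.append({"html": post["html"]})
    return payload


feed = [
    {"visibility": "public", "html": "<div>public post</div>"},
    {"visibility": "friends", "html": "<div>friends-only post</div>"},
    {"html": "<div>unknown audience</div>"},
]
to_send = select_public_posts(feed)
```

In this sketch a post whose audience is unknown is dropped too, so the filter errs on the side of not collecting.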

how to keep supporters engaged?

We can provide information Facebook cannot!

  • Escape the filter bubble
  • Understand your personal trends
  • Understand your friends' influence
  • See the human networks you belong to

Happier users, more data for researchers!

Personalisation algorithms influence the collectivity, and the collectivity has to be analyzed

If some fresh, new users behave in the same way, will Facebook treat them equally? Which parameters have the most influence?

💡 Can you emulate a high-income Luxembourg user and compare with a low-income grandpa near the German border? (different browser)
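
One possible way to put a number on such a comparison is to measure how much two timelines overlap. The sketch below uses Jaccard similarity over post identifiers. The metric, the `timeline_overlap` name and the sample ids are assumptions for illustration; the project does not state which measure it uses.

```python
from typing import Iterable


def timeline_overlap(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity between the sets of post ids seen by two users."""
    seen_a, seen_b = set(a), set(b)
    union = seen_a | seen_b
    if not union:
        return 1.0  # two empty timelines are treated as identical
    return len(seen_a & seen_b) / len(union)


# Hypothetical post ids collected from two emulated users.
user_one = ["post-1", "post-2", "post-3", "post-4"]
user_two = ["post-3", "post-4", "post-5", "post-6"]
overlap = timeline_overlap(user_one, user_two)
```

A value near 1 would mean the two users are shown almost the same posts, a value near 0 that their feeds barely intersect.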

Alpha - Goal (Sep 2016)

  • Show a proof of concept of a tool for “monitoring social media from the user's point of view”.
  • Envision a project able to engage with a diverse audience.
  • Attract attention in the technical and human rights defender communities in order to raise criticism of social media power.
  • Explore ways to make the power of the algorithms, otherwise invisible and still greatly debated, “visible”.

Alpha - Stack

  • Tampermonkey (a userscript manager that runs as a browser extension). The userscript parses the Facebook posts and extracts metadata.
  • A server based on Node.js and MongoDB. The server receives the metadata extracted by the userscript.
  • A set of analysis functions to analyse the dataset derived from the collection. The functions are accessible via a web API, and the content is visualized via a webapp (d3.js)

Alpha - Considerations

  • Promoted posts and feed posts have a different nature. The former do not have a publication date. Open problem: should promoted posts be considered along with the feed, ignored, or analyzed separately for a different purpose?
  • Facebook changes the HTML format quite frequently. As a technical mandate, we have to keep our dataset clean.

Beta - Goal (Dec 2016)

  • Privacy statement (Greg McMullen): a clear statement concerning the data we manage, how we manage it, and how we deal with users' privacy.
  • Archiving: raw data from extensions
  • Distributed analysis: the metadata extraction process can be outsourced, both in development and in computing power.

Beta - Goal

  • Enabling skilled adopters: a simple and anonymized API
  • Sharing agreement: the goal is to set the appropriate ethical standard under which stakeholders can freely obtain access to our database in order to perform advanced research, while avoiding SOCMINT (social media intelligence)

Beta - Stack

  • Public/private key schema for contributors’ authentication.
  • API for distributed data mining. It has already been used by other people (within the framework, or in Python to do natural language processing).
  • Selecting JavaScript frameworks for simple visualization and third-party inclusion (besides d3.js, c3.js, heatmap-cal and datatables.net are used right now).

FAQ

  • The project facebook.tracking.exposed has often been mistaken for a tool that offers a different experience of Facebook directly. It is not: we only work passively on the timeline of the user.
  • Can Facebook block the plugin? They can certainly give us trouble, but we are not using the API, so any measure to control our usage would have to be targeted specifically against the project.

FAQ

  • Information diet: we oppose the idea that technology might tell you what is fake news, validated news, a trusted source, etc. Technology can’t replace critical judgment. Our goal is to compare the information you are exposed to during your Facebook experience with a larger and more diverse set of sources. Users then decide whether they are happy with what they are shown on Facebook or whether they should broaden their information base.

Interfaces

  • Providing interfaces with actual features, features that Facebook can't provide, is the strategic way to make users feel a benefit from adopting the tool
  • RealityCheck

API

Time of activity of the supporter

Absolute order post list

Parsers

Dataset

Running parsers

  • postType
  • promotedTitle
  • promotedLink
  • promotedInfo
  • feedPostHref
  • feedReactions
  • imageTag(?)
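
Each parser reads a stored HTML snippet and fills in one or more fields of the metadata record. The sketch below shows the idea for `postType`, `feedPostHref` and `promotedLink`, using Python's standard `html.parser`. Only the field names come from the list above. The sample markup, the `data-promoted` marker and the names `LinkCollector` and `parse_snippet` are assumptions for illustration; Facebook's real markup is different and changes often.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML snippet."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def parse_snippet(html: str) -> Dict[str, Optional[str]]:
    """Fill a partial metadata record from one collected snippet."""
    collector = LinkCollector()
    collector.feed(html)
    # Hypothetical rule: a "data-promoted" marker identifies a promoted post.
    post_type = "promoted" if "data-promoted" in html else "feed"
    first_link = collector.hrefs[0] if collector.hrefs else None
    return {
        "postType": post_type,
        "feedPostHref": first_link if post_type == "feed" else None,
        "promotedLink": first_link if post_type == "promoted" else None,
    }


feed_record = parse_snippet('<div><a href="/user/posts/123">3 hrs</a></div>')
promoted_record = parse_snippet(
    '<div data-promoted="1"><a href="https://example.com/">Shop</a></div>'
)
```

Keeping the raw snippets and re-running parsers like this over the dataset is one way to cope with a change in the HTML format: only the parser has to be updated, not the collected data.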

and now?