3.2 - Self-perpetuation of relevance
Having analysed implicit and explicit bias in the creation of algorithms, as well as learned bias and prejudice in automated algorithmic fine-tuning, we now turn to how the effects of algorithmic categorization influence the information the public is exposed to. This section focuses on the tendency of algorithmic processes to self-perpetuate: they autonomously create relevance through their choices in prioritizing content, then feed those filtered results back into their own processes, creating a closed amplification loop.
The presentation of algorithmic results is designed for self-amplification
Algorithms are meant to filter information, making it easier for human beings to reach the results most relevant to them. This section focuses on the choices made in how results are presented, and on how those choices shape the audience's perception of the importance of results. However efficient and accurate an algorithm may be, it cannot change an individual's capacity to parse and absorb the presented information. Algorithms can sort and filter; it is up to user interface designers and information architects to make the information understandable and accessible to the public. The merits of user interface design and information architecture fall outside the scope of this paper. It is important to note, however, that strong user interface constraints limit how information can be presented. Take the Google search engine as an example: it presents results in separate pages, averaging ten results per page. This is a user interface choice, but also a presentation constraint of the online medium. The rate at which users access pages beyond page one drops precipitously: a 2009 study found that:
Thirty-six percent of the participants did not go beyond the first three search results (12% went beyond the first three results in all information assignments). Ninety-one percent did not go further than the first page with search results in all the assignments. (Van Deursen & Van Dijk, 2009, p. 6)
We can assume that limiting the first page to a fixed number of results makes those results appear disproportionately higher in quality than results on subsequent pages, even though the difference in relevance between the last result on page one and the first result on page two is presumably no greater than the difference between the second-to-last and the last result on page one.
This phenomenon, which this paper will refer to as the presentation cut-off point, means that deliberate choices in limiting the presentation of results disproportionately amplify the results that make the cut: users are more likely to interact with those results, which in turn strengthens how relevant algorithms perceive them to be for a given subject. Algorithms, as a rule, treat user interaction as a relevant signal when assigning weight to content (as seen in the preceding chapter). By enacting a cut-off point on the presentation of information, algorithm owners increase the probability that users will interact with that content; the record of that interaction is then captured by algorithms and further boosts the weight of the presented content. This is the self-perpetuating loop of relevance, and it shapes which information is presented to the audience far beyond the initial choice of presentation constraint.
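To make the loop concrete, consider the following minimal Python sketch. It is a toy model under assumed parameters (a ten-result cut-off and illustrative click probabilities), not a reconstruction of any platform's actual ranking code: items that start on page one attract more clicks, and those clicks feed back into the very scores that decide page one.

    import random

    PAGE_SIZE = 10            # presentation cut-off: results shown on page one
    P_CLICK_PAGE_ONE = 0.30   # assumed click probability for page-one results
    P_CLICK_BEYOND = 0.03     # assumed click probability past the cut-off

    def rank(scores):
        # Order result ids by current relevance score, highest first.
        return sorted(scores, key=scores.get, reverse=True)

    def simulate(scores, rounds=50):
        # Toy loop: click probability depends on position, clicks raise scores.
        for _ in range(rounds):
            for position, item in enumerate(rank(scores)):
                p = P_CLICK_PAGE_ONE if position < PAGE_SIZE else P_CLICK_BEYOND
                if random.random() < p:
                    scores[item] += 1   # user interaction fed back as relevance
        return rank(scores)

    # Twenty results that start with nearly identical relevance scores.
    scores = {"result_%02d" % i: 100 - i * 0.1 for i in range(20)}
    print(simulate(scores)[:PAGE_SIZE])   # the initial page-one items dominate

Even though the initial scores differ only marginally, the items that happen to start above the cut-off accumulate interactions that the items below it never get the chance to earn.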
Similar self-amplification takes place on information platforms that implement trends or trending topics. Trending topics are usually extrapolated from hashtags: strings of characters preceded by a hash sign (#). Hashtags are a way for content producers to codify an additional layer of meaning onto a piece of information, signalling that an article, post or tweet is part of a larger conversation. Content platforms like Facebook and Twitter (the platform that gave birth to the hashtag phenomenon) analyse the usage of hashtags and flag those used by a certain number of users in a certain period of time as "trending" topics. The platforms then present lists of trending topics back to the audience as a technique for exposing and defining the conversation zeitgeist. It is important to note that the exact algorithmic method of calculating trending topics is itself a black box: during the Ferguson incident, the main trending topic on Facebook was the Ice Bucket Challenge (Hern, 2014), even though it is safe to assume that Ferguson had a volume of mentions on Facebook comparable to its volume on Twitter at the same time.
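A naive version of such a mechanism can be sketched in a few lines of Python. This is an illustration of the general principle only: the post structure, time window and user threshold are assumptions, and the platforms' real criteria remain undisclosed.

    from collections import defaultdict
    from datetime import datetime, timedelta

    def trending(posts, window=timedelta(hours=1), min_users=1000):
        # Count distinct users per hashtag within the recent time window.
        # Assumed post shape: {"user": str, "tags": [str], "time": datetime}
        now = datetime.now()
        users_per_tag = defaultdict(set)
        for post in posts:
            if now - post["time"] <= window:
                for tag in post["tags"]:
                    users_per_tag[tag].add(post["user"])
        # Flag tags that cross the user threshold, most widely used first.
        return sorted(
            (tag for tag, users in users_per_tag.items() if len(users) >= min_users),
            key=lambda tag: len(users_per_tag[tag]),
            reverse=True,
        )

Whatever the real implementation, the essential point stands: some quantitative threshold decides which conversations are reflected back to the audience as "what everyone is talking about".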
As with the cut-off point in internet searches, trending topics themselves act as a self-perpetuating limiter of information: an algorithmic black box analyses the conversation zeitgeist and presents the results back to the audience, which in turn amplifies the importance of the displayed trending topics and confirms their relevance in an act of self-fulfilling prophecy.
The final aspect of the self-perpetuating relevance of algorithmic content filtering is the relationship between amplification and the relevance of content. Having little or no qualitative judgment, automated trends risk amplifying content based on quantity instead of quality, at the expense of other potentially relevant topics. With humans as gatekeepers, we can assume that there is a purpose to the agenda-setting process, with all the caveats seen in chapter 2. This argument is not evoking the spectre of censorship (chapter 3.3 will focus on that); rather, it sheds light on the fact that weighting conversation by volume instead of meaningfulness is another crucial way in which algorithms modify the information we as the public are exposed to. The risk of amplifying irrelevance is understood by algorithm owners, who tweak and modify their algorithms to try to surface meaningfulness. Twitter, for example, modified how trending topics calculate importance specifically to reduce relevance hijacking by very vocal groups of users. One famous case is Justin Bieber, whose fans accounted for 3% of overall Twitter traffic; this spurred Twitter to recalculate how trending topics are presented, in order to avoid Justin Bieber being constantly positioned at the top of the list (Gillespie, n.d.; "Twitter on Twitter," n.d.).
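One plausible reading of such a fix, offered here purely as a hypothetical illustration and not as Twitter's actual formula, is to score a topic by its growth relative to a historical baseline rather than by raw volume:

    def trend_score(current_count, baseline_count):
        # Hypothetical scoring: growth relative to a historical baseline,
        # rather than raw mention volume.
        return current_count / max(baseline_count, 1)

    # Constant high-volume chatter: 50,000 mentions this hour, as every hour.
    print(trend_score(50_000, 50_000))   # 1.0  -> not trending
    # A breaking topic: 8,000 mentions this hour vs. 200 per hour historically.
    print(trend_score(8_000, 200))       # 40.0 -> trending

Under such a scheme, a persistently high-volume conversation never "spikes" and therefore never trends, while a genuinely new surge surfaces immediately, however modest its absolute volume.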
The Justin Bieber example, however, shows that manual and purposeful modification of algorithmic results does happen, and there is indication that it happens on other information platforms: see, for example, the 2016 case in which an ex-Facebook employee alleged that the company was purposefully censoring conservative news from its (algorithmically driven) trending news section (Bowles & Thielman, 2016). The active manipulation of algorithmic trends under the control of algorithm owners is particularly important in light of some of the factors mentioned before, such as explicit bias and the focus on return on investment, as well as the lack of moral, ethical and legal mechanisms regulating how algorithms are built and implemented. These factors indicate that, in addition to automated filtering processes, there is another dimension that influences which information makes it through the algorithmic gates: active restriction of content.