PC: People mentioned 1) silos within and between the academy and other industries; 2) that resources and mentorship for maintaining and producing qualitative data are scarce. The tour of the library and the difficulty in maintaining the archives (despite all the fantastic work they are doing) were revealing… Though part of this is of course infrastructural (i.e. the library floods…)
PC: One dominant frame/concern is that of Western/foreign intrusion-- “data colonialism.” Though I didn’t hear that term per se, the core idea was being tossed around… Do Kenyans own the data or do outsiders? And even more generally, I heard people leveraging their African identity with respect to data and research… I may be reading too much into this, but that seems to signal Africa in opposition to… what? Perhaps in opposition to current (US/European) centers of data gathering and analysis.
PC: One take: “in digital spaces, you can be very invisible… an anonymous African researcher.” In this sense, greater connection flattens key aspects of one’s identity while boosting other aspects (e.g. one’s university affiliation might become more important)... As a result, some kinds of (elite) identification may become elevated in importance while marginalizing others.
PC: Something that came up here in the “risks and benefits” discussion is how digitizing qualitative data opens up the risk of de-contextualized/flattened interpretation of data. Someone gave the example of historians who may no longer even need to come to Nairobi to conduct research “in the archives.”
PC: One person pointed out that Kenya has the infrastructural capacity (5 tier three data centers; M-Pesa is hosted here; Visa/bank transactions hosted here), but not necessarily the expertise to make full use of that capacity. This, however, is mostly referring to big (quantitative) data storage and analytics. On the qualitative side, some discussion came out in the panel on students… E.g. students didn’t know about the journals, what the journal admissions process looks like, where they could pull data from, where they could store data, or even how to access paywalled research. So even if the infrastructure is there, knowledge about the infrastructure or the ability to access it might be limited.
PC: Copyright as an issue came up. One major tension was around data ownership-- data localization was a key point that I picked up on. For example, if sensitive Kenyan data is stored on Amazon or Microsoft cloud services, then if the US decides to subpoena it, there’s nothing Kenya can do about it. (Probably a similar or even graver concern with Huawei and China.) At the same time, people mentioned that there are clear advantages to scale. “There’s a reason no one uses their company emails here… everyone will have a gmail or yahoo account.” Also, strict data privacy or localization of data infrastructures may be a barrier to data sharing / open data. How do we reconcile this tension? One thing I would note here is that most people’s concerns dealt with digital trace data, not traditionally qualitative data. Beyond copyright issues, I don’t remember hearing too much concern with sharing qualitative data, perhaps because the infrastructure or ability to share/analyze such data at scale (for profit / control) isn’t quite as common.
PC: A key tension I see here is between open / universalized / decontextualized (typically quantitative or at least digital) data, and localized, particular (often qualitative but not always) data. Open data on the one hand allows for effective data sharing, easier access to data for otherwise marginal populations, the ability to critique/monitor those in power… On the other hand, universalized data opens up populations to observation by powerful, external actors. That data is easy to port and therefore can easily be sold and used… Localized and particular data infrastructures keep knowledge within the community (and therefore within community control), but also silo information… A core concern here then is power… Grace makes the point down the line that there is a “huge asymmetry between the people who produce the information and the people who analyze the data.”
PC: Most everyone seemed interested… One question that I thought was especially provocative (and that went somewhat unaddressed) was the question of data formatting for storage. Already, CDs, audio tapes, etc. are difficult formats to retrieve data from simply because the tools to read them are less and less available. If we are thinking of long-term data storage, what are our options?