Managing #Data for Visual Analytics

“In database research, the big data issue has mainly been addressed as a scale issue: storing and providing the same level of services as before in terms of speed, reliability, interoperability and distribution. The scale challenges are now being solved by research and industry. A less advertised issue raised by the increase of available data is the unprecedented opportunity for discoveries and exploratory studies. For example, Metagenomics is a research field in Biology that sequences DNA from random samples of material collected in the wild to discover the largest possible diversity of living species. With new high-throughput sequencing technologies, the genome of millions of species are sequenced and stored in databases but are not systematically explored due to the lack of appropriate tools. Bank transactions are being logged and monitored to find patterns of frauds, but new schemes arise continually that need to be discovered in the billions of transactions logged daily. This task of discovery of innovative fraud schemes is not well supported by existing tools either. Similar problems arise in a wide range of domains where data is available but requires specific exploration tools, including sentiment analysis on the Web, quality control in Wikipedia, and epidemiology to name a few. However, current database technologies do not meet the requirements needed to interactively explore massive or even reasonably-sized data sets”

A philosopher of science’s response to the challenge of #bigdata biology

“Big data biology—bioinformatics, computational biology, systems biology (including ‘omics’), and synthetic biology—raises a number of issues for the philosophy of science. This article deals with several such: Is data-intensive biology a new kind of science, presumably post-reductionistic? To what extent is big data biology data-driven? Can data ‘speak for themselves?’ I discuss these issues by way of a reflection on Carl Woese’s worry that “a society that permits biology to become an engineering discipline, that allows that science to slip into the role of changing the living world without trying to understand it, is a danger to itself.” And I argue that scientific perspectivism, a philosophical stance represented prominently by Giere, Van Fraassen, and Wimsatt, according to which science cannot as a matter of principle transcend our human perspective, provides the best resources currently at our disposal to tackle many of the philosophical issues implied in the modeling of complex, multilevel/multiscale phenomena”

Data-driven sciences: From wonder cabinets to electronic databases #bigdata

“Even by the journal’s own standards, this was a wild claim. In July 2008, Wired magazine announced on its cover nothing less than “The End of Science”. It explained that “The quest for knowledge used to begin with grand theories. Now it begins with massive amounts of data”. Such claims about the emergence of a new “data-driven” science in response to a “data deluge” have now become common, from the pages of The Economist to those of Nature. Proponents of “data-driven” and “hypothesis-driven” science argue over the best methods to turn massive amounts of data into knowledge. Instead of jumping into the fray, I would like to historicize some of the questions and problems raised by data-driven science, taking as a point of departure the three rich papers by Isabelle Charmantier and Staffan Müller-Wille on Linnaeus’ information processing strategies, Sabina Leonelli and Rachel Ankeny on model organisms databases, and Peter Keating and Alberto Cambrosio on microarray data in clinical research”

Shift to computational social science #bigdata

“The era of big data has created exciting new opportunities for research to achieve high relevance and impact amid changes and transformations in how we study social science phenomena. With the emergence of very large-scale data collection techniques and the related new technological support, there seem to be fundamental changes that are occurring with the research questions we can ask, and the research methods we can apply. The contexts include social networks and blogs, political discourse on the Internet, corporate announcements and digital journalism, mobile telephony and digital home entertainment, online gaming and social shopping, and social advertising and social commerce – and much more. The increasingly advantageous costs of data collection, and the new capabilities that researchers have to conduct research that leverages the spectrum of micro-, meso-, and macro-level data suggest the possibility of a scientific paradigm shift toward computational social science with big data. The new thinking related to empirical regularities analysis, experimental design, and longitudinal empirical research further suggests that these approaches can be especially tailored for rapid-acquisition big data contexts that involve new ways for researchers to achieve frequent, precise and meaningful observations of real-world phenomena. We discuss how our philosophy of science should be changing in step with the times, but argue against the assertion that theory no longer matters”


On #BigData Algorithms

“The extensive use of Big Data has now become common in a plethora of technologies and industries. From massive data bases to business intelligence and datamining applications; from search engines to recommendation systems; advancing the state of the art of voice recognition, translation and more. The design, analysis and engineering of Big Data algorithms has multiple flavors, including massive parallelism, streaming algorithms, sketches and synopses, cloud technologies, and more. We will discuss some of these aspects, and reflect on their evolution and on the interplay between the theory and practice of Big Data algorithmics”


Collaborative Big Social Data #Bigdata

[…] “Big Data” becomes “Big Social Data” when it arises as a result of human-to-human interaction, and herein lies the key to unlocking important insights about social processes operating at a worldwide scale as they unfold over time, the potential for which is greater than ever before because of the ubiquitousness of social media. Interactions can be of many kinds (conversation, exchange, response, relationship), and observed at the individual (survey response, votes, purchases), group, organization, and nation (trade, conflict, population movements) levels. When people interact through web, mobile device and distributed sensors, digital traces of these interactions are left behind. These historic interactions become more easily quantifiable through digitization and sharing of document and image archives. As a consequence, we face a transformative and disruptive data deluge, from which new scientific, economic, and social value can be extracted”…

#BigData Viz and Philosophy of Science

“As data-intensive and computational science become increasingly established as the dominant mode of conducting scientific research, visualisations of data and of the outcomes of science become increasingly prominent in mediating knowledge in the scientific arena. This position piece advocates that more attention should be paid to the epistemological role of visualisations beyond their being a cognitive aid to understanding, but as playing a crucial role in the formation of evidence for scientific claims. The new generation of computational and informational visualisations and imaging techniques challenges the philosophy of science to re-think its position on three key distinctions: the qualitative/quantitative distinction, the subjective/objective distinction, and the causal/non-causal distinction”…

Big data and urban human mobility

“The modeling of human mobility is adopting new directions due to the increasing availability of big data sources from human activity. These sources enclose digital information about daily visited locations of a large number of individuals. Examples of these data include: mobile phone calls, credit card transactions, bank notes dispersal, check-ins in internet applications, among several others. In this study, we consider the data obtained from smart subway fare card transactions to characterize and model urban mobility patterns. We present a simple mobility model for predicting peoples’ visited locations using the popularity of places in the city as an interaction parameter between different individuals. This ingredient is sufficient to reproduce several characteristics of the observed travel behavior such as: the number of trips between different locations in the city, the exploration of new places and the frequency of individual visits of a particular location”

Big Data and Marketing Metrics (2013)

“Recent empirical studies show that successful companies are distinguished by their ability to use “big data” for strategic decision-making at senior-level (Brown et al., 2011; LaValle et al., 2011; Manyika et al., 2011; Shah et al., 2012). What is missing is a study that thoroughly explores and defines the phenomenon of metrics in the context of “big data” and that provides a holistic investigation of the interdependent role of marketing metrics and financial metrics for senior-level management within the current information and technological landscape. A research agenda is suggested to study whether senior-level managers are guided by a set of marketing metrics or whether the traditional financial metrics still dominate in organisations. In particular, this agenda explores five research challenges that deserve the attention of current and future marketing research”