Harnessing the deluge of personal data: Part 1

As I alluded to in an earlier post, I'm trying to better organize my life so that ultimately I can accomplish more of the things that make me happy. So I've been thinking more critically about ways to improve how I manage my time and information, and exploring existing ideas and tools. There are lots of parts to this problem, so it is important to distill the essential bits and develop a unified conceptual understanding.

The internet is abuzz lately about the problem of data management: as computation, information, and communication technologies improve, more and more data is available about every facet of our existence. But how do we best make use of this growing volume of data to improve our lives? A beautiful example that illustrates the key patterns underlying this problem is the online personal finance tool Mint. When you sign up for Mint, you provide it with the login info for all of your online bank accounts (checking, savings, credit cards, etc.). It then retrieves every transaction you make from these different sources and aggregates the data to provide a unified big picture of your finances, allowing you to study spending trends, monitor your budget, or track various financial goals.
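To make the aggregation idea concrete, here is a minimal Python sketch of the pattern described above: transactions from several accounts are merged into one timeline, then summarized by category and checked against a budget. This is only an illustration, not how Mint is actually built. The Transaction fields, the function names, the sample figures, and the choice to store amounts in integer cents are all my own assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    # All field names here are hypothetical, chosen for illustration.
    account: str       # label for the source account, e.g. "checking"
    day: date          # date the transaction posted
    category: str      # spending category, e.g. "groceries"
    amount_cents: int  # negative = money spent, positive = money received


def aggregate(*sources):
    """Merge per-account transaction lists into one list sorted by date."""
    merged = [tx for source in sources for tx in source]
    return sorted(merged, key=lambda tx: tx.day)


def spending_by_category(transactions):
    """Total money spent (in cents) per category, across all accounts."""
    totals = defaultdict(int)
    for tx in transactions:
        if tx.amount_cents < 0:  # ignore income and refunds
            totals[tx.category] += -tx.amount_cents
    return dict(totals)


def over_budget(totals, budget):
    """Map each over-budget category to the amount (in cents) exceeded."""
    return {
        category: spent - budget[category]
        for category, spent in totals.items()
        if category in budget and spent > budget[category]
    }


# Made-up sample data standing in for two separate online accounts.
checking = [
    Transaction("checking", date(2011, 1, 3), "groceries", -5420),
    Transaction("checking", date(2011, 1, 15), "salary", 200000),
]
credit_card = [
    Transaction("credit card", date(2011, 1, 5), "dining", -3000),
    Transaction("credit card", date(2011, 1, 9), "groceries", -4580),
]

all_tx = aggregate(checking, credit_card)
totals = spending_by_category(all_tx)
overages = over_budget(totals, {"groceries": 8000, "dining": 5000})

print(totals)
print(overages)
```

The point of the sketch is that neither account on its own reveals the grocery overspend. It only shows up once the two sources are combined, which is exactly the "unified big picture" that makes aggregation valuable.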

This elegant solution would not be possible without the following key technologies coming together: machine-readable electronic data + internet communications + raw computational power. Personal financial data has existed for hundreds of years, but until the past few decades it was stored on paper in books. Even after the emergence of computers and electronic records, this type of data was not readily accessible. It is only in the past decade or so that all three of these critical components could be integrated to open a new era for personalized computable data.

The story of Mint taming the flood of financial data is really a precursor to personalized data management on a much larger scale. Over the past few years the internet has exploded with services, on both web-based and mobile platforms, aimed at managing personalized data spanning a diverse collection of knowledge areas and life pursuits. This has the feeling of an inflection point of some kind. But it is not clear what the transformative technology that enables individuals to truly harness this ever-expanding volume of data will look like. Is it possible to build an all-purpose universal data management system, providing aggregation, curation, social sharing, and analysis capabilities that can meaningfully cope with such a broad spectrum of data? It seems doubtful, for example, that a single service could adequately cope with both financial and medical data through a unified interface. However, it seems perfectly reasonable to devise a single platform for handling social bookmarking of movies, TV shows, books, and music.

In my next post, I will distill the problem further into the core irreducible concepts, and link to some key concrete examples that I have encountered over the past several weeks.