Codegraphy Project

Examples of relational data visualizations: CodeFlower is a D3.js module for drawing a file dependency graph; CodeCity represents classes as buildings and packages as districts laid out in a grid, color-coded with code metrics; and Circos is a tool for visualizing annotated relational data laid out on a circle.

Lately I've been doing a bit of research to find out what kind of code metrics are commonly used to better understand the structure and health of a codebase, and what tools exist for visualizing those metrics. It's a pretty vast subject (I've probably only scratched the surface in my research), but I'll try to give a summary of my findings so far, and sketch out what I hope to tackle in this area.

I would like to build a tool to visualize the relational structure and informational flow in a large-scale iOS project. I'm a very visual thinker who likes to gain a big-picture understanding of things, which I find can be difficult to do when joining a new project with a large codebase. It would be very helpful if there were ready-made tools for Objective C that could visualize a call graph or dependency matrix of the code, color-coded with metrics like lines of code, cyclomatic complexity, test coverage, or modification activity. Such views would help identify hot spots of potential code smells and support better-informed, iterative architectural decisions.

Although code metric and visualization tools do exist for statically typed languages like C# (e.g. Visual Studio, NDepend) and Java (e.g. Sonargraph, JArchitect), as well as dynamically typed languages like Python (e.g. Radon, Python Call Graph), Ruby (e.g. Code Climate), and JavaScript (e.g. JSComplexity, Code Climate), there seems to be a dearth of such tools in the land of iOS (see this Wikipedia page for a list of static code analyzers, which these tools are typically built upon). For Objective C I have unearthed a couple of tools that look worth investigating further: SonarQube is a multi-language platform for managing code quality that has an Objective C plugin, and I also came across a blog post that describes how to set up iOS code metrics in Jenkins. There is also a python script for generating an import dependency graph.

Given that a visualization tool for Objective C code structure and metrics doesn't exist (at least not in the form I have in mind), I've begun to explore what it would take to build one. The first ingredient I'll need is a tool for parsing code and generating the relational graphs I would like to visualize. The clang compiler has a C API library called libclang that can parse C/C++/Objective C code into an abstract syntax tree (AST) and traverse the resulting structure. There is also a convenient python binding for libclang (for a helpful reference, see this blog post).
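
To give a flavor of the libclang approach, here is a minimal sketch using the python binding; the file name and compiler flags are placeholders, and a real project would need the same flags Xcode passes to clang.

```python
# Minimal sketch: parse an Objective C file with the libclang python
# bindings and dump its AST. 'Foo.m' and the flags are placeholders.
import clang.cindex

# If the bindings can't locate libclang, point them at it explicitly, e.g.:
# clang.cindex.Config.set_library_file('/path/to/libclang.dylib')

def dump_ast(cursor, depth=0):
    """Recursively print each AST node's kind and spelling."""
    print('%s%s: %s' % ('  ' * depth, cursor.kind.name, cursor.spelling))
    for child in cursor.get_children():
        dump_ast(child, depth + 1)

index = clang.cindex.Index.create()
tu = index.parse('Foo.m', args=['-x', 'objective-c'])
dump_ast(tu.cursor)
```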

So, the first step in creating a visualization tool is to use libclang to process all of the Objective C code in a project into a graph data structure (or dependency matrix). But what defines this structure? What are the nodes and links? Depending on the analysis, one could consider a node to be a file, a class, or perhaps even an object. A link corresponds to some kind of directional relation between nodes: a file depends on another file, a class calls a method of another class, or an object is injected into another object, via either constructor or method injection. I've begun to explore these different possibilities, and likely more than one will turn out to be useful.
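
As a rough illustration of the class-as-node choice, here is a hedged sketch (building on the translation unit `tu` from the previous snippet) that records an edge whenever code in one class sends a message to an instance of another. Treating the receiver's static type as the target class is a crude heuristic that ignores id-typed receivers and class messages.

```python
# Hedged sketch: build a class-level dependency graph from an AST.
from collections import defaultdict
from clang.cindex import CursorKind

def class_graph(tu):
    """Return {class_name: set of classes it sends messages to}."""
    edges = defaultdict(set)

    def visit(cursor, enclosing_class):
        if cursor.kind in (CursorKind.OBJC_INTERFACE_DECL,
                           CursorKind.OBJC_IMPLEMENTATION_DECL):
            enclosing_class = cursor.spelling
        elif cursor.kind == CursorKind.OBJC_MESSAGE_EXPR and enclosing_class:
            children = list(cursor.get_children())
            if children:
                # Crude heuristic: the receiver expression's static type
                # names the target class.
                receiver = children[0].type.spelling
                if receiver:
                    edges[enclosing_class].add(receiver)
        for child in cursor.get_children():
            visit(child, enclosing_class)

    visit(tu.cursor, None)
    return edges
```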

The next major step after building a relational structure will be to calculate various code metrics, such as lines of code (LOC), complexity, and code coverage, which can be encoded in graph element styling such as node size and color. Beyond the usual basic metrics, it would be interesting to consider ways to quantify properties such as coupling and cohesion, both within the source code and between source and tests, to get a sense of how amenable the code is to modification.
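
As an example of a metric computed straight off the AST, here is a sketch that approximates McCabe's cyclomatic complexity per method by counting decision points; it is a simplification (for instance, it ignores && and ||, whose opcodes are awkward to read through the python binding).

```python
# Hedged sketch: approximate cyclomatic complexity per Objective C method
# as 1 + the number of decision points in its body.
from clang.cindex import CursorKind

DECISION_KINDS = {
    CursorKind.IF_STMT, CursorKind.FOR_STMT, CursorKind.WHILE_STMT,
    CursorKind.DO_STMT, CursorKind.CASE_STMT,
    CursorKind.CONDITIONAL_OPERATOR, CursorKind.OBJC_FOR_COLLECTION_STMT,
}

def complexity(cursor):
    """1 + number of decision points under this node."""
    return 1 + sum(1 for node in cursor.walk_preorder()
                   if node.kind in DECISION_KINDS)

def method_complexities(tu):
    """Return {method_name: complexity} for each method definition."""
    return {c.spelling: complexity(c)
            for c in tu.cursor.walk_preorder()
            if c.kind == CursorKind.OBJC_INSTANCE_METHOD_DECL
            and c.is_definition()}
```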

The final form of this tool will most likely be a D3.js-driven interactive web page. I've come across some existing code that should serve as useful references, such as CodeFlower and DependencyWheel (which is similar to a Circos visualization). I'm also intrigued by the CodeCity project, which is built around a city metaphor, representing classes as buildings. I wonder how far one could take that metaphor, perhaps superimposing transit-like network structures to represent the flow of data through the system.
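
To bridge the two halves, the graph and metrics could be serialized into the {nodes, links} JSON shape that D3 force layouts conventionally consume; field names like 'complexity' below are illustrative, not fixed.

```python
# Hedged sketch: serialize the class graph and metrics into the
# {nodes, links} JSON format commonly fed to a D3 force layout.
import json

def to_d3_json(edges, metrics):
    names = sorted(set(edges) | {t for ts in edges.values() for t in ts})
    index = {name: i for i, name in enumerate(names)}
    nodes = [{'name': n, 'complexity': metrics.get(n, 1)} for n in names]
    links = [{'source': index[src], 'target': index[dst]}
             for src, targets in edges.items() for dst in targets]
    return json.dumps({'nodes': nodes, 'links': links}, indent=2)

# e.g.: open('graph.json', 'w').write(to_d3_json(edges, metrics))
```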

RF Proximity Spikes

StickNFind, Gimbal, and RedBearLab beacons use Bluetooth LE technology to detect the proximity of a mobile device.

For more than a year I've been scoping out potential devices I could use for RF proximity sensing, a technology that has become increasingly mainstream since Apple embraced it with their Bluetooth LE-based iBeacon spec and developer API (subtly introduced in iOS 7 last summer). The basic premise is that a very small device (a 'beacon') with a radio frequency antenna periodically emits a signal broadcasting a unique identifier assigned to the device, and a mobile device such as an iPhone can then detect that signal when it is within range. By gauging the signal strength, the mobile device can estimate how close it is to the beacon. The Bluetooth LE specification is well suited to this type of application, and several manufacturers had already started bringing their devices to market before Apple formally released their proprietary iBeacon specification earlier this year.
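
To make the signal-strength-to-distance step concrete, a common first-order approach (vendor-agnostic, not tied to any particular SDK) is the log-distance path-loss model:

```python
# Hedged sketch: the standard log-distance path-loss model for turning a
# received signal strength (RSSI, in dBm) into a rough distance estimate.
def estimate_distance(rssi, tx_power=-59, n=2.0):
    """rssi: measured signal in dBm; tx_power: calibrated RSSI at 1 m
    (iBeacon advertisements carry this value); n: path-loss exponent,
    roughly 2.0 in free space and higher indoors."""
    return 10 ** ((tx_power - rssi) / (10 * n))

# A reading of -75 dBm with the defaults suggests roughly 6 m.
print(estimate_distance(-75))
```

In practice raw RSSI fluctuates considerably, so readings are usually smoothed (e.g. with a moving average) before being fed into a model like this.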

One of the earliest devices in this space was the StickNFind beacon, released commercially over a year ago with the targeted use case of finding lost objects. The company quickly began offering developer support in the form of an iOS SDK and a developer forum. I purchased 10 StickNFind beacons in bulk, at $15 per unit, last summer and began trying out their SDK shortly thereafter. Overall, I found their developer support left much to be desired: new versions of the SDK were released infrequently, and via email rather than through a hosted web site. I also found the signal strength of the beacons to fluctuate rather widely, which made reliable proximity detection difficult. After several months I gave up on their platform.

Toward the end of 2013, Qualcomm unveiled their Gimbal proximity beacon platform with full iOS and Android developer support. When I registered for their developer program, they shipped three of their beacon devices for free to help me get started, so I began experimenting with their devices and feature-rich SDK earlier this year. I found the iOS-facing API to be well architected and easy to use. In particular, the key ingredient for my use case is the ability to continuously monitor a beacon's signal strength while the app is backgrounded, for which they provide a callback method. Once they made the devices available for purchase, at an astonishingly low cost of $5 per unit, I obtained enough to start deploying them in my home for testing. I'm not yet fully committed to Gimbal, but so far it seems like the most promising option.

Another strong contender in this space, which I've only recently begun to explore, is the offering from RedBearLab. Their devices are quite a bit pricier at $25 per unit (prices vary with purchase size), but what I find compelling is that their platform is open and supports the iBeacon spec. I believe this would allow more under-the-hood customization than Gimbal offers, with the obvious downside of requiring more up-front investment in building out the API.

Hopefully I'll find time to explore these devices further so I can commit to one platform and move on to the next stage of my project.

WorldLine Project

A visualization of daily activity data taken with a prototype iPhone app I built called HabStats.

For some time I've been fascinated with the notion of recording my daily activities. I recall, as a postdoc in Toronto in 2001, installing a time tracking app on my Palm Pilot (something like this) and using it vigilantly to keep track of how I was spending my time. I felt empowered to have hard data, which often contradicted my subjective sense of time. I also realized that the act of recording what I was doing increased my in-the-moment awareness, but it was also tedious, and after a month or so I abandoned the practice.

Years later, after the iPhone had been released upon the world, my interest in time tracking was rekindled, and I began to envision what a re-imagined time tracking app would look like on the iPhone. I eventually purchased an iPhone in 2009, along with an Apple developer account, largely driven by my desire to build this tool. So, in my spare time, I learned Objective C and taught myself the iPhone SDK. It took a couple of years of starts and stops, but eventually I had a working app I called HabStats, with plans to release it to the App Store. However, after using the app to continuously track my activities for a week in March of 2012, I became discouraged by the tediousness and intrusiveness of having to interact with my phone throughout the day. I put the project aside once again, later giving a summary of it as a show-and-tell presentation at the Chicago Quantified Self meetup.

The project never died, but it has pivoted in a new direction. To alleviate the tedium of manually recording activities, I began exploring the possibility of using GPS, motion sensors, and RF proximity sensors to build up a data set that could be mined for human activity information. This is becoming an increasingly crowded problem space: there already exist mobile apps, such as Moves, that track your movements using GPS. Combined with Apple's iBeacon platform and motion trackers like the Fitbit, I think such data will make it possible to pinpoint one's activities fairly accurately. This is what I'm currently working on, having rechristened the project WorldLine. Hopefully I'll be able to make progress in the coming months.