I just found a program called GrandPerspective that presents your disk usage as an interactive treemap (see pic on right). Helping web nerds save hard drive space isn't finding hidden heart defects or keeping planes in the air, but I was struck by how well this program demonstrates the power of intelligent data exploration tools. Here are the Tufte criteria for information presentation:
Documentary · Comparative · Causal · Explanatory · Quantified · Multivariate · Exploratory · Skeptical
Each box is a file, and each top-level directory takes a contiguous rectangular portion of the view. Scanning a 350GB disk with a /lot/ of tiny files (5+ million for just the far top left corner, the MLB gameday dataset) took under 5 minutes. You can highlight any box in a segment and navigate "down" to make that segment fill the screen, and you can choose to color files by location, depth, name or extension (exploratory, multivariate).
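For the curious, here is a rough sketch of how a picture like that can be laid out. It uses the plain "slice-and-dice" treemap scheme, which is my own choice for illustration and not necessarily what GrandPerspective does internally. The nested-dict `tree`, the file names, and the 60x40 canvas are all made up; the only point is that each file's rectangle ends up with an area proportional to its size, and each directory stays one contiguous block.

```python
def total(node):
    """Size of a file (a number) or of everything under a directory (a dict)."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def layout(node, x, y, w, h, vertical=True, name=""):
    """Yield (path, x, y, w, h) for every file under node.

    The rectangle is divided among the children in proportion to their
    size, and the split direction alternates at each directory level.
    """
    if not isinstance(node, dict):
        yield (name, x, y, w, h)
        return
    size = total(node)
    if size == 0:
        return  # empty directory: nothing to draw
    offset = 0.0  # fraction of the rectangle already handed out
    for child_name, child in node.items():
        frac = total(child) / size
        path = f"{name}/{child_name}" if name else child_name
        if vertical:
            # side-by-side strips: each child gets a share of the width
            yield from layout(child, x + offset * w, y, w * frac, h, False, path)
        else:
            # stacked strips: each child gets a share of the height
            yield from layout(child, x, y + offset * h, w, h * frac, True, path)
        offset += frac


# Hypothetical disk: sizes in GB, names invented for the example.
tree = {
    "data": {"junk.txt": 15, "games.csv": 5},
    "video": {"a.mov": 30, "b.mov": 10},
}
rects = list(layout(tree, 0, 0, 60, 40))
for path, x, y, w, h in rects:
    print(f"{path:15s} x={x:5.1f} y={y:5.1f} w={w:5.1f} h={h:5.1f}")
```

Slice-and-dice tends to produce long skinny strips on real disks, which is why real tools usually pick a layout that keeps boxes closer to square. The diskspace=area idea is the same either way.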
The giant orange box in the top left was 15GB of pure junk -- apparently a CGI script generating some page I was screenscraping went crazy and sent me 15GB of junk data, the same line repeated almost billions of times. I had /no/ idea it was sitting there. That dataset was supposed to be huge, so I had never drilled into the directory beyond my standard du -sc | sort -n on the containing directory. The picture, however, showed at a glance what a table of numbers had dramatically failed to show: that the directory consumed twice as much space as it should have. The simple metaphor of diskspace=area and the whole-disk view (explanatory, documentary) highlighted something important I'd never noticed.
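If you'd rather stay in the terminal, the thing a directory-level total hides is the individual outlier. Below is a minimal sketch that ranks single files by size, which is roughly what it would have taken to spot a 15GB junk file without a picture. The function name `largest_files` is made up for this post, and it reports apparent file size (`st_size`), not allocated disk blocks, so the numbers won't exactly match what du prints.

```python
import heapq
import os


def largest_files(root, n=10):
    """Return the n largest files under root as (bytes, path), biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                # lstat: a symlink counts as the link itself, not its target
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return heapq.nlargest(n, sizes)


# Example use (the path is whatever directory you suspect):
#   for size, path in largest_files("/path/to/dataset", 5):
#       print(f"{size / 1e9:8.2f} GB  {path}")
```

It still gives you a table of numbers, though, so it only helps if you already think to look.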
The giant cluster in the bottom right corner is a huge (~51GB) collection of video ephemera I only kinda cared about. I planned, someday, to sort them -- but given that effort and the 51GB they were taking up, it was clearly not worth it. By enforcing comparisons, the data display made me reconsider the value vs. resource consumption of that project and make a sounder decision.
In all, I freed up almost 100GB and put a few bucks in the author's tip jar. Joe Bob says check it out. (Similar programs exist for Linux (Baobab) and Windows (WinDirStat) too.)
Labels: apps, clustering, data, disk usage, free, infographics, mac, mipmap, open, osx, usage, useful, utilities, visualization