Computer Science and Empiricism

In a talk at UW today, Alfred Spector, Google’s VP of Research and Special Initiatives, made a point I hadn’t considered before: computer science is much more empirical today than it was when he was a graduate student 30 years ago.

I’m currently reading Logicomix, a graphic novel with a twofold role as a biography of Bertrand Russell and a concise history of the quest for mathematical certainty that took place in the early 20th century. Reading it alongside a courseload of discrete math and theory of computation classes has me knee-deep in the mathematical foundations of computer science. While these foundations are valid and necessary for a historical appreciation of the field, they don’t always lend themselves well to contemporary issues faced in industry and academia.

Spector’s talk reinforced the image of Google as grand archiver and distributor of the consolidated sum of human knowledge, a role not always well served by a traditional approach. In recent years, we’ve seen a rise in parallel and probabilistic approaches to emerging problems: MapReduce/Hadoop, machine learning, Bayesian this, Markov that. The astronomical amounts of data that companies like Google have to deal with open the door to statistical methods.

Computer science deals with more measurable quantities than it used to. For example, as networks and systems grow larger, small margins of error or delay become more readily measurable. When the entire world is your testbed, you have to measure and test every aspect of the systems you build. In this sense, CS is becoming an increasingly empirical science.

I only wish this were emphasized more often in classes, rather than by visiting guest lecturers.

Geolocation API for distributed computing research

Last quarter, I quit my web development job at the UW Clinical Trial Center in order to pursue research within UW’s CSE department. As a starter project for Seattle, a distributed computing research project, I put together a simple geolocation library that uses the Python package pygeoip to look up location data for hostnames and IP addresses.

The first step was to set … Read the rest

Calculating server time with a JavaScript date object

A few weeks ago at work I was assigned to develop a warning banner to be displayed across our intranet data entry site to warn users of impending code launches. Our users tend to leave pages open for a while as they enter data, so we wanted the banner to display dynamically. For example, if someone had a page open during a specified warning time, … Read the rest

Visual feedback and saving inventions on Eureka

I just launched a couple of new features for Eureka. The first is an animated loading bar that appears after you click the button to generate an invention. It’s very minor, but I thought some visual feedback was necessary instead of having the user stare at an unchanging screen while the Markov processor runs.

Second, you can now save and revisit inventions. As suggested by commenters, generated-invention pages now have … Read the rest

Introducing Eureka Invention Generator

I just submitted a little web app I’ve been working on this summer for Sunlight Labs’ Apps for America 2 programming challenge. The idea of the contest is to build an app around one of the datasets provided by our friendly US government on the new Data.gov website. My app is called Eureka and it generates inventions.

The app is built around a Markov processor … Read the rest
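The excerpt above doesn’t show how Eureka’s Markov processor actually works, so here is only a rough sketch of the general technique: a word-level Markov chain that learns which words follow which, then walks those transitions at random. The names `build_chain`, `generate`, and `SAMPLE_TITLES` are my own placeholders, and the sample titles are made up for illustration, not taken from the app or from Data.gov.

```python
import random
from collections import defaultdict

# Hypothetical training text, made up for illustration only.
SAMPLE_TITLES = [
    "method for cooling a portable engine",
    "apparatus for cooling a folding chair",
    "method for folding a portable antenna",
]


def build_chain(texts, order=1):
    """Map each `order`-word state to the list of words seen right after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, max_words=12, rng=None):
    """Random-walk the chain from a random starting state."""
    if not chain:
        return ""
    rng = rng or random.Random()
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    print(generate(build_chain(SAMPLE_TITLES)))
```

Repeated followers are kept in the list on purpose, so more common transitions are picked more often. A higher `order` should give more coherent but less surprising output.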
