Saturday, June 24, 2017

Life in Piecewise Linear Form/ACM Awards

I have been physically handicapped the past several months, but managed to make it with some difficulty and pain to the 50th celebration of ACM Turing Award.

I liked the deep learning panel pitting the thinkers -- Stuart Russell, Michael Jordan and others  -- against/with the doers like Ilya Sutskever. The panel had some zingers like Stuart's reference to the Graduate Student Descent method. The panel more or less agreed on the strengths of deep learning --- power of learning circuits as concepts, throwing lot of computing power etc -- and their challenges -- thinking, reasoning. Judea kept pushing the panel to look at limitations of deep learning, "if I added another layer, I still can not do.... what?"

Joan Feigenbaum gave a powerful introduction to privacy vs security conundrum and had a superb panel that debated the issues in a nuanced way Crypto and Security people tend to do when confronted with the intersection of their work with social, policy and human issues. The panel was chock full of examples of "security holes" that are mainly human failings.

These meetings give me a chance to see people of course, but also see several lines of research and how far they have stretched, they need to stretch.

PS: It was great to have Moni and Amos get the Kanellakis Prize and the Diff Privacy folks get the Godel Prize.


Heavy Hitters in Ads/Analytics

I have been revisiting heavy hitters but in high dimensional contexts. See the ad exchanger note on joint work with Adobe researchers, on use of graphical model sketches.

Friday, April 28, 2017

Self III

Continuing the theme of Self II:

Yesterday, as I left my building, I ran into an uniformed NY sanitation worker who was rifling through the garbage from my building. I asked him, "Is something wrong?"  He looks up from the black garbage and asks, "Are you the Super?". This is the most natural mapping of (Me -> Role) among strangers in  Tribeca. And when I say, No, he asks incredulously, "Do you live here?". And I was wearing my best clothes. :)

Tuesday, April 25, 2017

Algorithm and Activism

There is a lot more to algorithms and activism, but for now, take a listen to this page of Adela Wagner and her simple but powerful work.

Friday, April 14, 2017

Data Science and Game Theory Workshop


Third Workshop on Algorithmic Game Theory and Data Science, to be held June 26, 2017 in Cambridge, Massachusetts.

From the call:

Computer systems have become the primary mediator of social and economic interactions, enabling transactions at ever-increasing scale. Mechanism design when done on a large scale needs to be a data-driven enterprise. It seeks to optimize some objective with respect to a huge underlying population that the mechanism designer does not have direct access to. Instead, the mechanism designer typically will have access to sampled behavior from that population (e.g. bid histories, or purchase decisions). This means that, on the one hand, mechanism designers will need to bring to bear data-driven methodology from statistical learning theory, econometrics, and revealed preference theory. On the other hand, strategic settings pose new challenges in data science, and approaches for learning and inference need to be adapted to account for strategization. The goal of this workshop is to frame the agenda for research at the interface of algorithms, game theory, and data science.


Wednesday, April 12, 2017

Algorithms in the Field: The PI Meeting

Algorithms in the Field is a program of Algorithms + Other CS researchers working jointly "in the field" of problems. Here is a history of development of this program. Recently I attended the PI meeting for this NSF program:
  • Nearly all the talks were interesting, genuinely targeting the middle ground. They didnt go into the rabbit holes of describing the minute technical improvements in approximation ratios or the 6 different data sets they had to cull to compare 4 different algorithms. Each speaker made an effort to cover the field area (special hardware like memristors, SDNs, wireless receivers etc) and also provide an overview of the algorithmic challenges. This was refreshing. 
  • There were some very interesting examples of the joint research: Ashish Goel spoke about societal decision making and examples such as budget decisions, viewed as collaborative convex programs; Piotr Indyk, ever-wise with his choices, spoke about the wireless transmitter/receiver setting for most part of his talk, really communicating the nature of the "field" before quickly connecting it to sparse Fourier estimation problems; Vyas Sekar spoke about formulating SDN routing policies using path constraints and posed some open problems, which Bobby Kleinberg who went just after, proceeded to solve at least partially, with his work. This shows (a) Algorithms in the Field research can happen in real time and (b) Dont speak before Bobby and pose open problems. 
ps: Sucheta Soundarajan and Martin Farach-Colton organized the meeting and did an excellent call to include graduate students/postdocs in the invite, graduate students/postdocs being the glue between professors and conduits for communication. 


Documenta 12--14

I think there is a quote that, "I have never seen a good Rothko, but I have seen a great roomful of Rothkos". I dont know who said it, but I am going to attribute it to Assaf Naor who reminded me of this phenomenon a while ago. Anyway, DC has a roomful (or two) of Rothkos.

Likewise, Documenta is an art show (Not Art Basel, but Art Kassel!), it may be good by itself, but great seen in mental juxtaposition. Here are the New York Times reviews of  Documenta 12 from 2007, 13 from 2012 and 14 from 2017 which has just started. 

Monday, March 20, 2017

Self II

Here is how the world sees me.
  • There is a coffee place in SFO, tucked away where people dont make it often. I pick up coffee there when I fly in from EWR, and they give me a discount because "you are a taxi driver". 
  • I went to a WeWork location in NY city to meet a friend, and as I walked in, the receptionist said, "talk to that person, she knows who needs the handyman in the building", and I got to go through the process of a fixer-upper to enter the building. 

Self I

People seem to like it when I poke at myself:

In a recent conversation, we discussed Dad Jeans, described precisely here, but more as a state of mind. A parent needs to think several steps ahead on behalf of their kid who can swerve from disegaged to insightful, be prepared for spills, and be prepared to be out the door the instant the kids are ready unexpectedly for the playground in the winter. So, Dad Jeans, is the choice of wear, it communicates that you are unable to be anything or be anywhere else, beyond your control.

Being me, I have to find my own way to express that state of mind, so these days I am doing Dad Hair, baggy, ready to follow me instantly, and unable to be anything else. :)

Monday, March 06, 2017

CS Divisions

Thanks to a recommendation from Marc Donner from old google days who now runs Uber, NYC, I am reading Sapiens by Yuval Harari. The author tries to explain the history of humans, succinctly, and succeeds by having an insightful view of anthropology, sociology, behavioral theory, and of course, science and religion too.  One of the interesting parts for me was the need humans felt to divide people into categories (think commoner/noble, castes, etc).  Alas, with division into categories, comes an imposed order among them and fights to invert the order. The author argues that this imagined order among humans keeps societies stable when it works, and unstable when it doesnt.

I have always been suspicious of divisions. In CS, folks divide areas of research. These are not islands.  In any area of research (say AI, social networks, Robotics, Brain, whatever), there are (a) theoretical foundations and optimizations, (b) new systems research into hardware and software needed to program them, compile into executables, execute them efficiently, (c) new data and UI systems to use, analyze, report, mine and troubleshoot, and so on. A great research will include conceptual breakthroughs, cacophony of math symbols no more than what is needed, potential for pretty plots, and a storyline for NY Times for societal impact. Most individuals' research doesnt hit on all these metrics, doesnt have to, we rely on the cumulation of research to hit all of the metrics. Any research area will be potentially less engaging without ALL of these elements, no order amongst them is needed. 

Extreme Streaming

I am making my way back into researching streaming problems.

One of the directions I am focusing on: how to use not polylog memory as is standard in streaming algorithms, but even smaller, say O(1) memory.  My coauthors and I have such algorithms for estimating the H-index on streams (to appear in PODS 2017, will be on arxiv soon) and estimating heavy hitters in a stream of streams model (to appear in SDN 2017).

I was sort of pushed into this model the way I like to find problems in general. If you look at modern applications, there are some real constraints. For examples in SDNs (Software Defined Networks), there are memory pipelines that packets can percolate through, each memory stage can be thought of as a row of standard sketches, and then one needs to compute something on top of these row estimates, but use only memory that can fit into a single packet header. Another example is that streaming analyses are done for a very large number of groups (say for each source IP address or internet user) and in that case, polylog memory per group is already far too much.

I call these extreme streaming problems, inspired by Extreme Classification in Machine Learning, which studies ML problems with a very large number of labels. I think there is more to mill here.