Wednesday, 28 June 2017

Communicating using Chessboards!

Any blog such as this must necessarily have a few puzzles. Here's one towards this, which I came across in a recent talk by Paul Cuff. Solving it has been one of my more enlightening moments in the past few months (another equally enlightening moment was standing on the weighing scale to discover how much weight I'd gained).

The puzzle is as follows. There are two parties, Alice and Bob. Alice has access to a chessboard with coins placed on each square. Using this, Alice wishes to send a 6-bit message to Bob. Alice is allowed to flip one coin (among the 64 coins) and upon flipping, the chessboard is made visible to Bob. The question is then to find a strategy that Alice and Bob can agree upon so that Bob can obtain the message on looking at the board.

As a primer, consider the case that the initial configuration (heads up or tails up) of the coins on the board is already known to Bob. A simple strategy here is to map the 6-bit message set to one of 2^6=64 squares, and Alice flips the coin on the square corresponding to the message she wishes to send. Bob then looks at the board, figures out which square has the flipped coin and viola, decodes the message. Sounds too trivial? Read on.

Now suppose that the initial configuration of the coins on the board is not known to Bob. Can Alice still send a 6-bit message? Sounds impossible? Turns out it is possible! Find a strategy for this!

A disclaimer: This problem might take some time to solve. However, the reward/enlightenment on solving it far outweighs the effort involved in doing so. As some of you may have noticed, this problem is notoriously similar to "Dirty Paper Coding".

Find the solution in the comments, solved by one of the legendary puzzle enthusiasts of UCSD, Anant Dhayal. So as to prevent any accidental viewing of the comments, I've filled some space with the famous poem by Robert Frost (which I've incidentally alluded to here).

The Road Not Taken

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

Friday, 17 February 2017

How to Make a Blog Popular?

In order to keep up with the expectations of the viewers (which I actually need not since it's mostly a null set), I've desisted from making any non-technical posts about the adventure that settling in at San Diego has been. As to the delay in posting here, suffice it to say that the clear sky, the beaches and the mild temperatures make for a very relaxed start to things. Now that it's been a few months, perhaps it's time to introduce my first research problem.

This problem has central motivations in increasing the popularity of this blog itself! Now, what makes a blog post? At the very basic level, one typically talks about 2 or more topics and in an elegant/novel fashion draws connections between them. However, as the recent elections have proven, popularity is driven more by the topics being of interest to viewers rather than the elegance itself (think Donald Trump).

Thus, one natural strategy is to first form a data-driven estimate of the set of people that are interested in various topics (say we have K different topics in total). Then, to use these estimates to make decisions on which topic to write a post on.

How do I go about doing this? Since I'd like to be a theoretician for the moment, at least until the point that I go looking for jobs, I'll simply do what every other theoretician does, assume that I already have these estimates. An interesting question now arises. To understand this, think of the following case first: If blog posts are to consist of only one topic, the optimal strategy is to simply choose the most popular topic.

However, say if two topics are to be included in the post, picking the first and the second-ranked topics might be sub-optimal. Moreover, this sub-optimality arises more often than not. Consider the case that the first rank goes to the topic "Kim Kardashian". Given this, it's entirely reasonable for the next rank to go to "Kanye West". Then, including these two topics might involve a large number of viewers which are in common to the two, thus disallowing us to leverage the presence of the second topic. One might, for instance, be better off including two very diverse topics such as "Kim Kardashian" and "Information Theory" to maximise the number of interested viewers.

More formally, we can think of each topic as corresponding to a set of users. Thus, we're interested in the two sets out of K, that when taken together gives the largest union. This problem has been widely studied under the title of "Data-Summarization". In one famous result, it was shown that an algorithm which picks one topic to be the first-ranked topic and the other to be that which creates the largest union with the former, gives us a solution that's within a factor of (1-1/e) of the optimal value. It was also shown that aspiring for anything more than this factor makes it an NP-hard problem.

However, if we impose additional constraints on our sets of interest, there's hope to find the optimal solution in complexity that's polynomial in K. We have promising results on some toy problems in this direction. But it remains to be seen if significant results can be obtained. Stay tuned for an update on either the toy problem if there's no success/a link to a paper if we do succeed.