Saturday 27 December 2008

From Bombay to Baramulla



Here is an argument I heard a lot six months ago, which, since 26/11, seems so facile that it has vanished from the debate. It is worth noting how dangerous this argument is right now, because it will find plausible new clothes and re-appear in six months, or a year, or in six years.

The argument goes roughly as follows:

Kashmir is not worth the bother. Let it go. Give it to Pakistan. Or give it independence. Once the vast resources invested in Kashmir are freed up, India can carry on realizing its manifest destiny as a great nation that all of humanity looks to for moral, spiritual, technological and economic leadership.

This argument was well expressed by Vir Sanghvi, in this piece in the Hindustan Times in Aug 2008. Vir Sanghvi is the Editorial Director of the Hindustan Times, the former editor of Sunday magazine, a fairly mainstream journalistic voice I've agreed with many times in the past.

What 26/11 made painfully obvious is the naivety of this notion: that a surgical excision of Kashmir from India would result in a quid pro quo reduction in violence on this side of the border. This notion now looks as shallow as the conceit, back in 1947, that partition would “solve” the problem of plural identities in India.

India’s federal, democratic structure is not perfect, but it is designed well enough to accommodate the many distinct identities within India. Vir Sanghvi correctly points out that secession movements inspired by language, race and religion have been successfully accommodated within India multiple times, in places like Tamil Nadu, Mizoram and Punjab.

The reason the federal, secular, democratic framework of the Indian constitution does not work for Kashmir is that a scary number of the people who claim to speak for Kashmir are not Kashmiris, and don’t especially care about the political expression of a Kashmiri identity. They are international jihadists. Palestine, Chechnya, Kashmir, Iraq, Afghanistan, American tanks on sacred Saudi soil, Western decadence, apostate regimes in the Islamic world, insults to Islam in Danish newspapers – any grievance is grist to their mill. Understanding who these jihadist terrorists are and what they are trying to achieve is essential to understanding Kashmir in context, and to getting a sense for what it might mean to surgically excise Kashmir from India.

Terrorism is not especially Islamic. The personal psycho-drama that happens within a terrorist is neither mysterious, nor Islamic, nor the by-product of failed societies. It is commonplace. David Kilcullen, a brilliant Australian anthropologist who first learnt about terror as a soldier serving in Indonesia, gets his students to watch the film Fight Club to understand a terrorist’s psychology.

In essence, contemporary jihad, like all terrorism, is a rational political strategy. It was invented as a modern political strategy in 1946, when David Ben-Gurion authorized the bombing of the King David Hotel in the then British Mandate of Palestine. The consequence of this bombing was that Clement Attlee expedited the withdrawal of British forces from Palestine, thereby establishing the sovereign state of Israel. Wikipedia maintains that a former Prime Minister of Israel, Benjamin Netanyahu, took part in celebrations to mark the 60th anniversary of this attack, organized by the Menachem Begin center. The most ruthless terrorists in the world today are probably Hindu Tamils in Sri Lanka. The consequence of their ruthlessness, especially in murdering dissenting Tamils, has been that theirs is the only audible voice that claims to speak for Lanka’s Tamil people. The PKK, led by Abdullah Ocalan, killed 40,000 innocent Turks in the name of Kurdistan, winning the sympathy of bleeding heart liberals in Western Europe as a "people without a nation". Danielle Mitterrand, the French President's wife, was a public supporter of Ocalan, and pleaded for clemency in sentencing when Ocalan was captured by the Turkish army. The hijack of IC 814 from Kathmandu to Kandahar in 1999 led to the release of Masood Azhar, a Jaish-e-Mohammed operative believed to be involved in the attacks on Mumbai. Another of the terrorists released after the IC 814 hijack, Omar Sheikh, was involved in the murder of WSJ journalist Daniel Pearl.

The most successful terrorist attack of all time, purely in terms of the political pay-off, must be 9/11. It resulted in Al Qaeda being addressed by the world’s only superpower as if it were a force of equal stature. Their preferred tactic, suicide bombings to murder non-combatants, is now dignified by the term War on Terror. Before the War on Terror, Al Qaeda was a toxic but small fragment left over from the Soviet war in Afghanistan, with no easily identifiable outcome it was working towards. It felt, or could have been made to feel, like an anachronism. It might have struggled to capture the imagination of young people; it might have struggled to stay alive another generation. That is no longer a problem, not for Al Qaeda.

It is now crystal clear that there is no morally justified use of terror, exactly like there is no morally justified use of genocide. It is also clear that terrorism is growing alarmingly because terror works. The only way in which the world-of-order can defeat terror is to make it a strategy that does not work.

Terrorists cannot be engaged and defeated in open battle. They need to be starved. They need to be deprived of the oxygen of media exposure, of new recruits, of arms, of money. Most importantly, they need to be deprived of the sweet smell of success.

The world-of-order needs to realize that this victory will be won slowly. This victory will involve intelligence, media management, paranoia, nonchalance, ruthlessness, mercy, narcotics control, anti-money-laundering operations, international co-operation... lots of stuff. There will be no spectacular television-friendly signing of treaties that constitute a “solution”.

This is the context in which the Kashmir situation must be understood. India’s political class has got this one right. Kashmir needs to stay within the Indian Union, with as humane a police/ military presence as is possible. Not because of jingoistic nationalism. But because throwing a hunk of juicy red meat to the beast of international terrorism, breathing energy and life into the beast, is the most dangerous and irresponsible thing any civilized nation could do, today, or at any time in the foreseeable future.

Monday 22 December 2008

Dravid's slump in form



I am on vacation in India. One of my nephews and I are vegging out in front of the TV on a Monday afternoon at my in-laws' place, while the rest of the family naps. We are watching India nurdle along at 2 runs per over on the fourth afternoon of the Mohali test against England.

The commentators don't have a whole lot to talk about. We are watching endless replays of Dravid's stumps being shattered by Stuart Broad. Are horrible pictures like this a sign of Rahul Dravid's decline? Or does this just happen sometimes to any batsman, however great? And what to read into his century in the first innings of the Mohali test? The commentators are blathering on and on...for long enough for my inner-analyst to want to get beyond the blather...

The commentariat all agree that Dravid is suffering a slump in form. What, unfortunately, has not been properly examined is whether Dravid has really been scoring fewer runs than before, or whether the perceived slump in form is nothing more than randomness playing out. It is entirely possible that Dravid is batting as well as he ever has, and that the dice just haven't rolled his way. The mind is very good at spotting patterns, especially when there aren't any.
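To make the "randomness playing out" point concrete, here is a minimal simulation sketch. It assumes a batsman whose true ability never changes, with each innings score drawn from a geometric distribution with a mean of 52 runs. The mean, the 200-innings career, and the definition of a "slump" (ten consecutive innings averaging under 30) are all illustrative assumptions, and not-outs are ignored.

```python
import math
import random

def simulate_innings(rng, mean_score):
    """Draw one innings score from a geometric distribution with the given mean."""
    p = 1.0 / (mean_score + 1.0)   # geometric on 0, 1, 2, ... has mean (1 - p) / p
    u = 1.0 - rng.random()         # uniform on (0, 1], so log(u) is always defined
    return int(math.log(u) / math.log(1.0 - p))

def has_slump(scores, window, threshold):
    """True if any run of `window` consecutive innings averages below `threshold`."""
    running = sum(scores[:window])
    if running / window < threshold:
        return True
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]   # slide the window by one innings
        if running / window < threshold:
            return True
    return False

def slump_rate(n_careers=2000, n_innings=200, mean_score=52,
               window=10, threshold=30, seed=1):
    """Fraction of simulated constant-ability careers containing at least one slump."""
    rng = random.Random(seed)
    slumps = 0
    for _ in range(n_careers):
        career = [simulate_innings(rng, mean_score) for _ in range(n_innings)]
        slumps += has_slump(career, window, threshold)
    return slumps / n_careers

rate = slump_rate()
print(f"Simulated careers with at least one ten-innings 'slump': {rate:.0%}")
```

Under these assumptions most simulated careers contain at least one such stretch even though the underlying ability is fixed, which is why a run of low scores on its own says little about decline.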

This question is inspired by Moneyball (recommended reading for any cricket fan). Moneyball is about how statistical analysis forms the foundation of a winning baseball team, the Oakland Athletics. It reports on persistent sporting myths that statistics bust. For instance, there is no such thing as a clutch hitter, a batter who does especially well in vital situations. Nor is there any such thing as a hot hand, a streak in basketball when an NBA player is "in-the-groove" and landing every shot in the basket.

Baseball is now enriched by a Society for American Baseball Research. The American Statistical Association now has a section dedicated to sports statistics. It is a pity that this quality of statistical analysis has not been applied to cricket, despite the richness of the data available. It is also an opportunity for smart young cricket-loving statisticians. Calling S. Rajesh of Cricinfo?

To give the interested (geeky) reader a flavour of what is possible, here is the outline of a statistical analysis that would shed more light on Dravid's form than anything that has appeared in the media so far. None of the techniques described below is very complicated, or goes beyond material taught routinely at the undergraduate level. I would be delighted to see this analysis available in the public domain along with a well documented methodology and explanations, and expect no credit or authorship rights. Also, a disclaimer. I am not a professional statistician; my knowledge of statistics is mainly as a customer to statisticians. Any feedback from readers with more statistical knowledge, especially around time-series analytic techniques that could improve this analysis, is appreciated.

Outline of desired analysis
Step 1: compile the dataset
Each record in the dataset is one of the ~25000 balls Rahul Dravid has faced in test cricket. Each record in the dataset has the following fields: outcome (which takes the values 0-6 and W, all represented as class variables), opponent (Australia, England etc.), bowler, bowler type (pace, military medium, leg spin etc.), location, home away flag (derived from location), innings (which-ith innings of a test match), position played in the batting order (mostly #3), number of balls already faced in the innings, date innings started, a random number (for validation in step 5).

I don't think any of this data is hard to obtain. It is reported in the ball by ball commentary on Cricinfo, which I'm assuming is professionally archived. This list is not meant to be exhaustive; most datasets come with a few plausible covariates that can be thrown in and played with.

A couple of fields I would love to add, which may be harder to obtain, are length (full, length, back-of, short) and line (outside off, off, middle and leg, outside leg). I believe this is the data the team statisticians sitting in the dressing room code in their laptops.

Examine the dataset to get familiar with patterns, especially with potentially tricky variables like bowler or location. For instance, a bowler Dravid has faced for 12 balls may have dismissed Dravid twice. Worth being aware of weird things in the data before running any regressions.
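As a rough sketch of what one record and a first sanity check might look like, the snippet below defines a ball-level record with the fields listed in this step and counts balls faced and dismissals per bowler. The class name, field names, and sample rows are all hypothetical; nothing here reflects how Cricinfo actually stores its data.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Ball:
    outcome: str           # "0".."6" or "W", kept as a class label, not a number
    opponent: str
    bowler: str
    bowler_type: str       # pace, military medium, leg spin, ...
    location: str
    home: bool             # home/away flag derived from location
    innings: int           # which innings of the test match
    batting_position: int  # mostly 3
    balls_faced: int       # balls already faced in this innings
    innings_date: date
    holdout_draw: float    # random number reserved for validation in step 5

def bowler_summary(balls):
    """Map each bowler to (balls bowled at the batsman, dismissals)."""
    faced = Counter(b.bowler for b in balls)
    dismissals = Counter(b.bowler for b in balls if b.outcome == "W")
    return {name: (faced[name], dismissals[name]) for name in faced}

# Invented rows: one bowler, 12 balls, two dismissals.
outcomes = ["0"] * 5 + ["W"] + ["1"] * 5 + ["W"]
sample = [
    Ball(o, "Opponent X", "Bowler B", "pace", "Ground Y", False, 1, 3,
         i % 6, date(2008, 1, 1), 0.5)
    for i, o in enumerate(outcomes)
]

summary = bowler_summary(sample)
for name, (n_balls, n_wickets) in summary.items():
    print(f"{name}: {n_balls} balls, {n_wickets} dismissals")
```

A table like this makes thin-data bowlers easy to spot before any regression is run.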

Step 2: run the regression model
Model the outcome (number of runs scored, wicket or dot-ball) as a multinomial logistic outcome. This class of models is used in transportation analysis - every commuter has the choice of multiple modes of transportation - or in brand analysis - every consumer has the choice of multiple brands of breakfast cereal. Similarly, every ball has the choice of different outcomes - from sixer through dot-ball to wicket.

Allow the model to see all the fields listed above. Do not constrain the model. All two-term interactions. Just maximize the fit. Essentially, the computer is finding the configuration of explanatory variables which maximizes the likelihood of observing the outcomes in the dataset.

Most modern statistical packages will apply simple transforms to covariates to improve fit, like for instance taking log(number of balls already faced), a transform which makes intuitive sense anyway.
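For readers who want to see the mechanics, here is a deliberately tiny sketch of a multinomial logistic fit in plain Python. It makes several simplifying assumptions that are not part of the outline above: the outcome is collapsed to three classes, only two covariates are used (log of balls faced and a home flag), there are no interaction terms, the data is synthetic with invented coefficients, and the fit uses a fixed number of gradient steps rather than a proper optimizer. A real analysis would use a statistical package.

```python
import math
import random

CLASSES = ["dot", "runs", "W"]   # simplified outcome classes (assumption)

def softmax(scores):
    top = max(scores)                              # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Class probabilities for one ball with feature vector x."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def log_likelihood(weights, rows):
    return sum(math.log(predict(weights, x)[y]) for x, y in rows)

def fit(rows, n_features, steps=300, rate=0.05):
    """Gradient ascent on the mean log-likelihood (not run to convergence)."""
    weights = [[0.0] * n_features for _ in CLASSES]
    for _ in range(steps):
        grad = [[0.0] * n_features for _ in CLASSES]
        for x, y in rows:
            probs = predict(weights, x)
            for k in range(len(CLASSES)):
                err = (1.0 if k == y else 0.0) - probs[k]
                for j in range(n_features):
                    grad[k][j] += err * x[j]
        for k in range(len(CLASSES)):
            for j in range(n_features):
                weights[k][j] += rate * grad[k][j] / len(rows)
    return weights

def make_synthetic_rows(n, seed=0):
    """Invented data: features are [intercept, log(1 + balls faced), home flag]."""
    rng = random.Random(seed)
    true_w = [[0.0, 0.0, 0.0], [-1.5, 0.3, 0.2], [-2.0, -0.4, 0.0]]  # made up
    rows = []
    for _ in range(n):
        x = [1.0, math.log1p(rng.randrange(200)), float(rng.random() < 0.5)]
        y = rng.choices(range(len(CLASSES)), weights=predict(true_w, x))[0]
        rows.append((x, y))
    return rows

rows = make_synthetic_rows(500)
start = log_likelihood([[0.0] * 3 for _ in CLASSES], rows)
weights = fit(rows, n_features=3)
end = log_likelihood(weights, rows)
print(f"log-likelihood: {start:.1f} before fitting, {end:.1f} after")
```

The only point of the sketch is that "maximize the fit" means pushing the log-likelihood up by adjusting one set of coefficients per outcome class.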

Step 3: read the results
First pass, one is expecting to see date of innings being statistically significant. If it is clearly significant, and the coefficient has the right sign (a decline in form), that probably means the effect is real. A completely unconstrained model might spit out some funky functional forms, with performance being a parabolic function of time...improving initially and then declining.

A bunch of other interesting effects will be visible at this stage, and are fun to look for. For instance, does Dravid have a nemesis bowler? Is Dravid genuinely as good abroad as he is at home? Has he done any worse as an opener than at #3? Is Dravid more vulnerable to full length deliveries on the slow pitches at home than abroad (does the interaction term between home away flag and length have a non-zero coefficient)?

Step 4: tweak the model
Refinements to the model are usually needed at this stage.

For instance, if no effect is observed overall, it might be because a real effect over the last six months may be hidden by the length of the continuous dataset in use. Converting time into six monthly blocks may be useful.

Also, a time effect might be masked because it is correlated with the opposition. It might look like Dravid just happens to be weaker against Sri Lanka and Australia, India's most recent opponents. In this case, one might want to force the model to accept time blocks before it admits opposition.

Bowlers with thin data might show up as having implausibly strong effects. One might want to modify the data to slot all bowlers who have bowled fewer than 250 balls at Dravid into a single pie-chuckers category.
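Two of the tweaks in this step, six-monthly time blocks and pooling thin-data bowlers, are simple data transformations. A minimal sketch follows; the function names, the block label format, and the bowler names are hypothetical.

```python
from collections import Counter
from datetime import date

def half_year_block(d):
    """Label a date with its six-month block, e.g. '2008-H2'."""
    return f"{d.year}-H{1 if d.month <= 6 else 2}"

def pool_thin_bowlers(bowlers, min_balls=250, pooled_label="pie-chuckers"):
    """Replace any bowler with fewer than min_balls deliveries by one pooled label."""
    counts = Counter(bowlers)
    return [b if counts[b] >= min_balls else pooled_label for b in bowlers]

# Invented example: one bowler per ball faced, in order.
bowler_column = ["Bowler A"] * 300 + ["Bowler B"] * 12
pooled = pool_thin_bowlers(bowler_column)

print(half_year_block(date(2008, 12, 22)))
print(Counter(pooled))
```

Both transformations trade detail for stability: fewer, better-populated categories give the model less room to fit noise.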

Step 5: validate the model
Keep a random subset of ~5000 balls outside the analysis described so far. Repeat the analysis on this holdout to make sure the results observed are similar. Validating on an additional time period is probably nonsense in this context, since time is a variable of interest.

A more interesting approach to validation is to validate on non-test match data. If Dravid is in decline, we would expect to see that in all forms of cricket.
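The random holdout described at the start of this step can be sketched using the random number stored on each record in step 1. The record layout, the 20% holdout fraction (roughly 5,000 of 25,000 balls), and the synthetic data are assumptions. For brevity the sketch compares a single summary statistic across the two subsets rather than refitting the full model on each.

```python
import random

def split_on_draw(records, holdout_fraction=0.2):
    """Split records into (analysis, holdout) using each record's stored draw."""
    analysis = [r for r in records if r["holdout_draw"] >= holdout_fraction]
    holdout = [r for r in records if r["holdout_draw"] < holdout_fraction]
    return analysis, holdout

def dismissal_rate(records):
    """Share of balls that ended in a wicket."""
    return sum(r["outcome"] == "W" for r in records) / len(records)

# Synthetic stand-in for the ball-by-ball dataset (invented 2% dismissal chance).
rng = random.Random(7)
records = [
    {"outcome": "W" if rng.random() < 0.02 else "0", "holdout_draw": rng.random()}
    for _ in range(25000)
]

analysis, holdout = split_on_draw(records)
print(len(analysis), len(holdout))
print(f"{dismissal_rate(analysis):.4f} vs {dismissal_rate(holdout):.4f}")
```

Because the draw is stored with the data rather than generated at analysis time, the same balls stay in the holdout every time the analysis is rerun.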

Step 6: Document the results and limitations
Gaps in data and any subjective interpretations or analytic choices (treatment of missing values, definition of class variables, etc.) would be logged here.

Some limitations are systematic. This dataset is limited to Dravid's performance only. So a generalized improvement in the performance of all test batsmen of the same time period would not be picked up by the model. It is possible that Dravid is playing as well as ever, and that the world has moved forward faster than Dravid. A more ambitious analysis spanning a broader base of test batsmen is needed to shed more light on this.

Also highlight opportunities to improve on the analysis. For instance, it would be interesting to compare Dravid's decline with that of other top players. Assuming there is a decline, is it worse than what Gavaskar or Border suffered? Data may be thinner in the pre-internet era...but maybe it is out there in official score sheets.

Most critically, this analysis does not tell the captain whether or not Dravid should be replaced with a younger batsman. That remains a judgment call, based largely on how he wants to build his team. What it may tell the captain is that Dravid's run of poor scores is explained by randomness and is likely to end soon. So we avoid the injustice of a great player being judged on poorly constructed evidence.

Sunday 21 December 2008

What is right about Indian education?

My friends and family in India are in a state of perpetual despair about our education system. The system does not even attempt to develop creativity, or critical reasoning, or the love of learning, or emotional intelligence. These are things we all want. Some good friends of mine are doing superb work to try and invigorate the system. But the fact remains that the system teaches students to learn by rote, to "crack" exams.

Yet, despite this depressing unidimensional approach, one can't help but notice that the products of this flawed system generally do OK. Certainly, compared to the products of other national systems, and better than the prevailing despair might suggest. Why?

Lord William Henry Beveridge, born in Rangpur, Bengal, to an ICS officer in 1879, quoted here in the Times, may have a clue. In his report on Social Insurance and Allied Services, submitted to the Government of Winston Churchill in 1942, he notes that:

“Most men who have once gained the habit of work would rather work... than be idle... "

Extrapolating a bit, teenagers who have worked their tails off to get 97% in their board exams have surely gained the habit of hard work. Ditto for the brutally competitive entrance exams which serve as gatekeepers to most walks of Indian life. The same habit of hard work has been instilled in ten times that number who slaved away for assorted entrance exams and didn't get accepted. The Indian educational system may be very good at giving kids a tough work ethic.

This may also provide a clue to another puzzle. Why do jocks, serious sportsmen, do much better than their CGPAs suggest? Maybe because they have developed a tough work ethic?

Wednesday 17 December 2008

Pyaasa



Since 26/11, it hasn’t felt appropriate to change the topic to something other than the Mumbai attacks. Sachin’s century at Chepauk has given me permission to do just that.

This test match was always about more than the cricket. For England to have shown the gumption to come out and play, for Strauss and Collingwood to play gritty career defining knocks, for Sachin Tendulkar to lead India on a record breaking run-chase, bringing up his century and the winning runs with the same shot, and for Sachin to have the grace and presence of mind to dedicate his century to the victims of the Mumbai attacks...

I'm just grateful that my favourite game can produce such a moment. The best piece I came across on this test match being about more than cricket was by Peter Roebuck in Cricinfo.

Yet, as a long-suffering India cricket fan, this win is special to me in just simple cricketing terms. This tickles the same spot as watching Ishant Sharma dominate Ricky Ponting at the WACA; it slakes a thirst that has been building up for decades.

Great teams chase down big targets. Bradman’s Invincibles chased down 404 at Headingley in 1948, to secure the Ashes and their status as the Invincibles. Steve Waugh’s Aussies chased down 369 in Hobart, 1999, after being 5 for 126, against a Pakistan attack that included Wasim Akram, Waqar Younis, Shoaib Akhtar and Saqlain Mushtaq. Clive Lloyd’s Windies had their own great chase when Gordon Greenidge made a mockery of David Gower’s declaration at Lord's in 1984 by chasing down 342 at a rate of over 5 an over while losing just one wicket. These wins matter more than others; they take on a talismanic quality, keeping the possibility of victory alive in a team’s imagination even in the worst situation.

India, Saurav’s India, had a talismanic win in Kolkata, 2001. But this win felt different from a triumphal march through a fourth innings chase. It was built around Rahul and VVS gritting out for survival in the third innings, with Harbhajan coming in to deliver the kill. Strangely, despite being a team built around a wealth of batting talent, India’s fourth innings performances have been appalling.

Think back to the inexplicable collapse to Shahid Afridi in Bangalore, 2005, needing to bat out a day to win the series. All out for 100 to the mesmerizing spin of Shaun Udal in Mumbai, 2006, again needing to bat out a day to win the series. Or losing three wickets in five balls to Michael Clarke in Sydney, 2008.

Going back a bit, remember the collapse in Barbados, 1997, when faced with the opportunity to be the first team in two generations to dethrone the Windies at home? Or falling achingly short of the mark against Akram’s Pakistan in Chennai, 1999, in an ill-tempered and tightly fought series. Even the famous tied test against the Aussies in Chennai, 1986, was a game India should have won in a canter.

India have also generally made heavy weather of small fourth innings targets, even if we did go on to win. It came down to Sameer Dighe and Harbhajan Singh to hold their nerve and chase down 155 in Chennai, 2001. Chasing 233 to win in Adelaide, 2004, was a nervy affair that could have gone either way.

Those ghosts have now been exorcised.

What else would I have wished for in this game? For Rohit Sharma, S Badrinath, Virat Kohli, Robin Uthappa, Suresh Raina, M Vijay and Shikhar Dhawan to have been sitting in the dressing room, absorbing the atmosphere, drinking in the subliminal belief that this is how India bats when it really counts.