Statistics and political interference: To trust or not to trust the data?

A witty statesman once said, you might prove anything by figures.

This is how Scottish essayist Thomas Carlyle begins the second chapter, titled “Statistics”, of his famous 1839 tract Chartism.

There are innumerable circumstances; and one circumstance left out may be the vital one on which all turned. Statistics is a science which ought to be honourable, the basis of many most important sciences; but it is not to be carried on by steam, this science, any more than others are; a wise head is requisite for carrying it on.

– Thomas Carlyle, Chartism

In a joint statement on Tuesday, 108 economists and social scientists around the world called out the blatant “political interference” in statistical data in India and the tendency “to suppress uncomfortable data”, calling for the restoration of “institutional independence” and integrity of public statistical organisations.

Their appeal comes against the backdrop of controversy over the revision of gross domestic product (GDP) back series data and the withholding of employment data collected by the National Sample Survey Office (NSSO).

The signatories included prominent economists and social scientists like Abhijit Banerjee, Pranab Bardhan, Jean Dreze, James Boyce, Jayati Ghosh, Amartya Lahiri, Sudha Narayanan, Ashima Sood, Jayan Jose Thomas, Vamsi Vakulabharanam and others.

A decadent science?

India’s statistical machinery, established by the British and carried forward after independence by the makers of democratic India, has long enjoyed a stellar reputation for the integrity of the data it produces on a range of economic and social parameters.

Under the ruling government, people’s respect for public data has gradually dwindled, as grossly inaccurate and inconsistent numbers are scattered like breadcrumbs for electoral gain. On numerous occasions, pro-government media reports and social media campaigns led by lawmakers themselves have been found to cite false numbers aimed at misleading people about “vikaas” (development).

Remember when the BJP’s official Twitter handle posted misleading bar graphs to portray the ‘Truth of Hike in Petroleum Prices’ last September? The errors became a national talking point largely by luck, perhaps because spiralling fuel prices were a pressing national concern at the time.


The Congress dealt with this statistical misrepresentation with sarcasm, but the entire incident made the government’s irresponsible attitude towards data apparent. Qrius has reported extensively on several instances when the implications of distorted data were direr; for example, when the caste census and the jobs report were allegedly “buried”.

The latter triggered a huge controversy when two members of the National Statistical Commission (NSC) resigned in January over the government’s refusal to release the post-demonetisation employment data, along with other controversial decisions, including the publication of the new GDP back series ahead of the upcoming polls.

What happened to the periodic labour force survey?

The NSC reportedly approved the Periodic Labour Force Survey (PLFS) report for release at its meeting on December 5, 2018, in Kolkata, but the Ministry of Statistics and Programme Implementation has not made it public yet.

Sources in the NSC indicated that the delay in publishing the household survey results could also be because the government is uncomfortable with its findings. This suggests the report may contain damning evidence that demonetisation adversely affected the job market.

“The report was approved and should have been released immediately but wasn’t. I thought I should not watch silently what was happening,” P C Mohanan, the commission’s acting chairman, told Business Standard after stepping down.

Business Standard later published the report, and it did not reflect the government’s employment initiatives in a positive light. In fact, it showed the highest unemployment rate in 45 years. Read a detailed analysis of the current status and future of work in India here.

But where does erroneous data come from? What happened to the days of responsible and reliable statistics? More importantly, how can one tell the two apart in order to make informed political decisions?

Take the caste census, for example

The Opposition has accused the Centre of indefinitely postponing the publication of the first Socio-Economic Caste Census in eight decades; the UPA government had conducted it in 2011 and submitted reports in 2015.

The latest census data, however, remains unavailable, although RTI activists have written to the Centre time and again over the last three years urging it to publish the findings.

On January 3, 2018, the office of the Registrar General of India said an expert group would analyse the data to classify the names of castes returned in the survey. The government had ordered the formation of such a group in July 2015; it is, however, yet to be set up.

There has been much criticism of the methods used to collect and database the 46 lakh castes, sub-castes, synonyms, surnames, clans, and gotras.

The objective of the latest census was to get a picture of the caste structure and work out targeted welfare schemes for the relevant groups. Experts have claimed that the census has thrown up striking numbers, including a higher OBC population figure, that could lead to demands for a higher quota in government jobs.

This could jeopardise the entire rationale behind introducing an additional quota for economically weaker students and job seekers in the upper-caste ‘general’ category.

The GDP back series controversy

The Centre had kicked up a major row last year when it claimed that India’s economy grew by 8.5% in 2010-11, and not by 10.3% as previously estimated, thus lowering the growth recorded during the UPA’s tenure by nearly two percentage points.

The Central Statistics Office (CSO) computed the new data and released it through NITI Aayog, after the Centre dismissed the earlier GDP back series the National Statistical Commission had put out; the latter had shown that the economy grew faster under the previous government.

A new back series was called for because the old data was deemed incomparable with that of later years, although the Centre’s move to change the base year from 2004-05 to 2011-12 after Modi came to power has been seen by critics as a deliberate bid to confuse the masses.

The recalibrated figures now show average GDP growth under the BJP government (7.35%) between 2014 and 2018 to be marginally better than its predecessor’s (6.7%) during 2005-14.

Moreover, the move to bypass and supersede the NSC’s series by opting for NITI Aayog’s has been opposed not only by members of the NSSO but also by the Reserve Bank of India. In fact, this is one of several points of friction between the central government and the central bank.

How trustworthy is NITI Aayog?

The NDA government formed the National Institution for Transforming India, also called NITI Aayog, to replace the Planning Commission.

Many economists, including former chief statistician Pronab Sen, have also questioned NITI Aayog’s role in releasing the statistical exercise of the CSO, which comes under the Ministry of Statistics and Programme Implementation (MoSPI).

Asserting the superiority of the coverage and methodology used to compile the second back series, NITI Aayog said in a statement, “Country’s leading statistical experts have checked the back series CSO released today for its methodological soundness.”

Finance Minister Arun Jaitley also defended the findings, saying, “The series have been revised based on the applicability of the data. The formula remains the same. Based on the same yardstick, the earlier years of the UPA have been revised downwards. So you gain in some years, you lose in some. Data is realistic, it is not fictional. So what was welcomed by the UPA in 2015 is now criticised in 2018 because it got revised downwards. CSO is a completely credible organisation and it maintains an arm’s length from the finance ministry.”

The growing redundancy of NSSO under NDA

The Department of Statistics came into being as a part of the Ministry of Planning in 1973. In 1999, it was merged with the Department of Programme Implementation to create an independent MoSPI.

The NSC was set up in 2006, replacing the earlier governing council, with Professor Suresh D Tendulkar as its first chairman. An autonomous body that assesses policies and approves reports developed by the NSSO, the commission comprises seven members, although it has been running at half strength for quite some time; according to sources, the government is moving very slowly on appointing new members.

Does this point to a larger problem?

The academic sector at present grapples with a peculiar problem, as teachers and researchers increasingly take to publishing in fake and/or substandard journals to secure grants, promotions, or employment. In 2016, the University Grants Commission (UGC) set up a committee to prepare an exhaustive list of legitimate academic journals across all disciplines and also recalibrated the metrics used for evaluation to tackle this problem.

This is important because the think tank sector, too, perhaps needs a list of this sort, especially for privately funded studies. The Quint notes, “Given the diversity of think tanks, it is unclear to those outside (and even some inside) the policymaking community what these institutions are, who funds them, how they influence policy, and how effective or ineffective they are.”

According to the book Strengthening Policy Research: Role of Think Tank Initiative in South Asia, the Centre for Policy Research in India has secured large private sector funds at the cost of having a large part of the money tied to specific projects. This undermines the very role of think tanks, which is to stimulate alternative policies where dominant, politically driven ones ignore certain communities and realities.

But if they have to depend on the government (as in the case of NITI Aayog) or large private donors (the Reliance-backed Observer Research Foundation), certain policy issues and perspectives will inevitably be excluded from the statistical exercise, which would automatically skew the data.

In that case, why do we need data at all?

If we are to assess and quantify a government’s performance, we need to back that assessment with facts and statistics, especially when that government came to power on the plank of job creation and employment generation and blatantly lies about meeting its targets.

Contrary to the promise of 20 million jobs made in 2014 and the expectations it raised, the past four years have seen a record decline in employment creation, as evidenced in the leaked jobs report. According to it, India’s unemployment rate attained a 45-year high of 6.1% in 2017-18, as opposed to 2.2% under the UPA government in 2011-12.

The report further shows that joblessness was higher in urban areas (18.7% for men and 27.2% for women) than in rural areas, with more people withdrawing from the workforce. Another alarming figure showed that more of the educated were jobless in 2017-18 than in 2004-05.

According to the International Labour Organisation, the number of unemployed people will rise to 18.9 million this year, from 18.6 million in 2018. About 25 million people applied for 90,000 railway jobs, while 3.7 million applied for 12,000 jobs with the Gujarat government last year. Communities like the Patels, Marathas, and Jats have staged mass demonstrations over the lack of jobs. Yet, the government denies its failure on this front and seeks re-election on the claim of having ushered in “development”.

Misinformation is not new to the BJP’s campaign strategy, but the role of manufactured statistics in securing political gains is often underrated. Data adds heft to otherwise unbelievable claims, and stray numbers are difficult for laypeople to assess or investigate. In an interview with the right-wing magazine Swarajya, the Prime Minister himself said, “More than a lack of jobs, the issue is a lack of data on jobs.” And yet, his government inflated e-governance data to boost its Digital India initiative.

So what do we do with the data at hand?

In the absence of reliable data, question everything.

Look for the sources, methodology, sample size, and demographic features of those surveyed, and for vested interests.

For journalists, reports relying heavily on data must carry citations, sources, and other relevant details. More importantly, they should acknowledge their limitations, if any, and cross-check the data for “other” perspectives and exceptions. This is a good example of statistical analysis based on a limited data set.

It is also wise to approach numbers with a healthy dose of incredulity: statistical data is, after all, a generalisation of the issue, and data collection is almost always flawed or incomplete.

For consumers of news, or prospective voters, it is vital to question the organisations, political parties, and think tanks that commission, sponsor, and conduct surveys, as well as the credibility of the institutes and analysts who tabulate the results.


Prarthana Mitra is a staff writer at Qrius
