Here’s a new technique for policy-makers that can help clear the air of data miasma to some extent, tell us where we stand, and help in government decision-making.
A few things have finally become clear, two-and-a-half months after the Wuhan virus pushed India into a lockdown:
Yet with all that hanging over our heads, the question remains: how is the disease spreading, and how is it being contained?
One option is to try and analyse the available epidemic data empirically, so that virus transmission, and efforts to control its spread, can be correlated in spite of shortcomings in data quality and quantity.
In this article, we shall demonstrate how a cross-plot of positivity and testing generates representative trends, from which some inferences can be made. These in turn help clear the air of data miasma to some extent, tell us where we stand, and point at some of the decisions governments might need to take soon.
In this instance, data from Tamil Nadu (TN), Maharashtra (MH), Gujarat (GJ) and Delhi (DL) have been used since they account for the bulk of the daily case numbers.
Andhra Pradesh (AP) and Kerala (KL) have been added as controls; AP, because it is the sole state to have successfully chased the virus thus far, in spite of heavy odds; and KL, because certain circles of commentary view the communist model of containment there as the ideal.
A status update first:
As of the start of Lockdown 5.0, GJ is the sole state to have entered a taper. It has started to cut across TN and DL. But as we shall see below, wild fluctuations in testing persist, along with apprehensions that case counts could thus rise temporarily.
The other five states continue to follow a broad, three-week growth exponent range, with recent, sharper trends in TN and DL. This includes AP, which has unfortunately sprouted fresh clusters in some districts.
However, as the next chart below shows, these rising trends are also, in part, a function of enhanced testing capacities, with more samples throwing up more case numbers.
This is an important point: analysts will have to make the distinction between epidemic spread on the one hand, and the testing out of individuals in identified, largely-contained clusters, on the other. This would have to be done on a case-by-case basis. And one way of doing that is by generating a relationship between cases and testing – the positivity versus tests plot:
By this correlation, we seek to link positivity (taken as a quantification of the epidemic’s extent in a region) with a validation of whether what we are gauging is representative or not (which is what tests per million is: a measure of how much of the population has been screened).
In a sense then, this is a control test, or a cross check, which enhances confidence levels of assumptions.
For example, if from the above plot we see that the figure of tests per million is visibly inadequate (which is why we have AP as a yardstick), or if the positivity factor is not declining in spite of a high tests-per-million count, then that is a point which administrators would need to flag, as they consider further options.
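The two quantities being cross-plotted can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not spell out its exact formula, so positivity is taken here as cumulative cases divided by cumulative tests, and the function name `cross_plot_points` and the sample numbers are hypothetical.

```python
from itertools import accumulate


def cross_plot_points(daily_cases, daily_tests, population):
    """Return one (tests_per_million, positivity_pct) pair per day.

    Assumption: positivity is cumulative cases / cumulative tests.
    """
    cum_cases = accumulate(daily_cases)
    cum_tests = accumulate(daily_tests)
    points = []
    for cases, tests in zip(cum_cases, cum_tests):
        if tests == 0:
            continue  # positivity is undefined before any testing
        tests_per_million = tests * 1_000_000 / population
        positivity_pct = 100.0 * cases / tests
        points.append((tests_per_million, positivity_pct))
    return points


# Made-up numbers for illustration only, not real state data.
points = cross_plot_points(
    daily_cases=[10, 20, 30],
    daily_tests=[1000, 1000, 2000],
    population=2_000_000,
)
```

Each pair is one point on the curve: the first value goes on the X-axis and the second on the Y-axis.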
The interpretations are straightforward: a consistently declining slope means that the epidemic is being successfully contained.
On the other hand, a rising slope means insufficient testing of an infected populace larger than currently defined.
The key distinction between the two is made upon the steepness of the rising slope: a gentle rise means that enhanced testing will soon map out the clusters. But a steeply-rising slope means that testing is lagging well behind infections and/or transmission.
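The slope reading described above could be roughly mechanised as follows. This is a sketch, not the method used for the charts: the window length and the `steep_threshold` cut-off between a gentle and a steep rise are hypothetical values that an analyst would have to choose and defend.

```python
def classify_trend(points, window=3, steep_threshold=0.001):
    """Classify the recent slope of a positivity vs tests-per-million curve.

    points: list of (tests_per_million, positivity_pct), in plotting order.
    steep_threshold: hypothetical cut-off, in percentage points of
    positivity per additional test per million.
    """
    recent = points[-window:]
    (x0, y0), (x1, y1) = recent[0], recent[-1]
    if x1 == x0:
        return "undetermined"  # no additional testing within the window
    slope = (y1 - y0) / (x1 - x0)
    if slope < 0:
        return "declining"  # containment appears to be working
    if slope == 0:
        return "flat"
    if slope <= steep_threshold:
        return "gentle rise"  # enhanced testing may soon map out clusters
    return "steep rise"  # testing is lagging behind infections


# Made-up curves for illustration only.
falling = [(1000, 5.0), (2000, 4.0), (3000, 3.0)]
gentle = [(1000, 1.0), (2000, 1.5), (3000, 2.0)]
steep = [(1000, 1.0), (2000, 4.0), (3000, 8.0)]
```

A simple end-to-end slope over the window is used here for brevity; a fitted line would be less sensitive to a single noisy day.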
Now look at the horizontal X-axis in Chart 3 above. Positivity could just as well have been plotted against time, as it usually is. But here, the unit employed is tests per million, as an increasing cumulative aggregate.
As a result, we are now able to create a link between a measure of cases and a measure of testing.
The resultant curves display trend breaks far more starkly; and, because the relationship is dimensionless, it also offers a welcome respite from selective release of data by governments on selected days for ‘selected reasons’.
One can thus compare the relative extent of the infection, against the relative extent of screening within a domain, with less data stagger.
If positivity is high when the curve is far to the left on the X-axis, then the problem is grave; if positivity is low, and the curve extends well to the right (say, 7,000 to 10,000 tests per million, or beyond), then the situation is well under control and headed for abatement.
If the curve is somewhere in between, then the administration of that province learns two things: where they might be headed if they don’t act decisively, and what they need to do to keep matters in check (like enhancing testing capacity, or expanding ambit of scrutiny).
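The three readings of a curve's position might be expressed as a simple rule. Only the 7,000 tests-per-million figure comes from the text; the cut-offs for "far to the left", "high" and "low" positivity are hypothetical placeholders, as is the function name.

```python
def read_position(tests_per_million, positivity_pct,
                  sparse_testing=2000,    # hypothetical "far to the left"
                  well_tested=7000,       # lower end of the range in the text
                  high_positivity=10.0,   # hypothetical, in per cent
                  low_positivity=2.0):    # hypothetical, in per cent
    """Map the latest point on the curve to one of three broad readings."""
    if positivity_pct >= high_positivity and tests_per_million < sparse_testing:
        return "grave"
    if positivity_pct <= low_positivity and tests_per_million >= well_tested:
        return "under control"
    return "in between"
```

For example, under these assumed cut-offs, `read_position(1500, 12.0)` returns `"grave"` and `read_position(9000, 1.0)` returns `"under control"`.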
Equally importantly, this empirical analysis offers indicators of how much a state should be testing, thereby giving policy-makers a rough quantification of what testing targets to aim at (and by corollary, another broad indicator of how many hospital beds, or ventilators, or other public health facilities need to be created).
Ideally, administrators would be well-advised to generate similar correlations at the micro level of, say, a ward or a postal district (data which is, unfortunately, seldom available in the public domain). Superimposed on GIS-amenable geographic and demographic parameters, and studied in tandem with other epidemic monitoring plots, this diagnostic tool would offer further insights into better defining a truly confounding problem.
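As a rough worked example of turning a tests-per-million target into a sampling plan, the arithmetic could look like this. The helper `days_to_target` and all the numbers are hypothetical; the sketch assumes a constant daily sample count, which real testing rarely achieves.

```python
import math


def days_to_target(population, cumulative_tests, target_per_million,
                   daily_samples):
    """Days needed to reach a tests-per-million target at a fixed daily rate."""
    target_total = target_per_million * population / 1_000_000
    remaining = target_total - cumulative_tests
    if remaining <= 0:
        return 0  # target already met
    return math.ceil(remaining / daily_samples)


# Made-up example: 10 million people, 40,000 tests so far,
# a target of 7,000 tests per million, at 6,000 samples a day.
days = days_to_target(10_000_000, 40_000, 7000, 6000)
```

Here the target works out to 70,000 tests in total, so 30,000 remain, which takes five days at the assumed rate.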
What must, however, not be forgotten is the inherent subjectivity of the interpretations, because one variable playing on the correlations is an intangible – the administrative efficiency of isolating containment zones.
This is because there could well be seepage of infection across porous boundaries, for various human reasons, which would show up on the plot as a steeply-rising curve. But if district administrations are largely successful in enforcing lockdown measures, then both trends and interpretations will hold.
Caveats notwithstanding, we may then infer the following, prima facie:
Gujarat: Ahmedabad, where most of the cases are, and where much of the testing is being done, is on the cusp. If testing can first be raised to above 6,000 samples a day, and held there for the next week, positivity (and cases) would in all likelihood go up temporarily, then plateau, and then mercifully decline.
Delhi: The capital’s curve is rising steeply. Strong fluctuations are also noted on the daily testing rates, meaning that a further enhancement of testing facilities is advised – to at least above 8,000 per day on a sustained basis, in the interim. Considering that it is already testing the most per million population, an inference is that further stress may simultaneously have to be placed on strictly controlling transmission as well.
Tamil Nadu: While the number of cases in the state has risen to over a thousand a day – mostly in Chennai – the positivity is still low enough to swiftly benefit from enhanced testing. This would become evident if the sample count was raised closer to 15,000 per day.
Andhra Pradesh: It seems almost unfair that the one state which managed to beat positivity down to under 1 per cent is now marked by a minor rising trend once more. Sadly, this is the way of the Wuhan virus. Still, if the state administration can raise its daily sample count to 12,000 and hold it there, matters could once again be brought under control soon. (A constraint which must be recognised in AP is the proliferation of fresh clusters in hitherto unmarked districts like Srikakulam, and a resurgence in the two Godavari delta districts. This geographical spread would result in a far-flung distribution of scarce resources – a problem not faced by the big cities.)
Kerala: The primary inference is that testing levels in the state are woefully inadequate. For a government which began the battle against the Wuhan virus first, the capacity ramp up has been far from satisfactory. This is exacerbated by a patently disquieting phenomenon unique to that region – the frequency of random case discoveries. One was uncovered when a coconut fell on a man, necessitating hospitalisation.
Another, for example, was when a man slipped across the border to TN, imbibed too much alcohol, and had to be taken to the hospital. Worse, an influx of returnees is under way, and land border screenings are restricted by insufficient testing teams to simple thermal scans; this means that asymptomatic carriers pass into the state undetected.
Further, cases are on the rise again, as is positivity, and the public has on occasion been witness to the oddity of more hotspots being declared on one day than cases of contact transmission. And to top it all, the state government’s data is riddled with inconsistencies. All of this brings into serious question the state government’s declaration of having flattened the curve, and makes even a positivity versus tests-per-million correlation potentially quite non-representative.
Maharashtra: India’s richest state, with the most cases, and the most at stake, merits specific detailing. For this, both positivity and daily cases have additionally been plotted against cumulative cases in an extra plot below. Note that positivity is plotted on the left Y-axis in linear scale, and daily cases on the right in logarithmic scale.
From this, we see that a ramping up of testing capacity has actually slowed down the positivity curve’s ascent this week (black line). It is therefore entirely possible that if the state can maintain a testing rate of over 16,000 samples a day for the coming crucial week, the spread of the virus can be successfully mapped out and contained. This is consistent with a hint of a plateau beginning to manifest itself on the purple daily cases curve.
To summarise, a cross-plot of positivity versus tests per million offers administrators a measure of how far matters are under control in their jurisdictions. It also indicates how much more effort and money they may need to spend to get ahead of the virus in a particular locale. And while it is no substitute for a simulation, or a precise forecast drawn from quality data, it is as good a supplementary route-marker for decision making as may be devised in these amorphous, uncharted circumstances.
All data from Covid19india.org and MoHFW