Saturday, January 30, 2010

Animal Activism and Statistical Fallacies

A recent article describes how animal rights activists from Mercy for Animals (MFA) went undercover and made some observations about animal abuse on dairy farms. See:
Governor Paterson, Shut This Dairy Down

The author of the above article states:

"But the grisly footage that every farm randomly chosen for investigation--MFA has investigated 11--seems to yield, indicates the violence is not isolated, not coincidental, but agribusiness-as-usual."

This is exactly why economists and scientists employ statistical methods. Anyone can make outrageous claims about a number of policies, but are these claims really consistent with evidence? How do we determine if some claims are more valid than others?

Statistical inference is the process by which we take a sample and then try to make statements about the population based on what we observe from the sample. If we take a sample (like a sample of dairy farms) and make observations, the fact that our sample was 'random' doesn't necessarily make our conclusions about the population it came from valid.

Before we can say anything about the population, we need to know 'how rare is this sample?' We need to know something about our 'sampling distribution' to make these claims.

According to the USDA, in 2006 there were 75,000 dairy operations in the U.S. According to the activists' claims, they 'randomly' sampled 11 dairies and found abuse on all of them. That sample represents just 0.0147% of all dairies. Suppose we wanted to estimate the proportion of dairy farms that were abusing animals, we wanted to be 90% confident in our estimate (that is, construct a 90% confidence interval), and we wanted the estimate to be within a margin of error of .05. The sample size required to estimate this proportion is given by the following formula:

n = (z / (2E))^2, where

z = value from the standard normal distribution associated with a 90% confidence interval

E = the margin of error

The sample size we would need is: (1.645 / (2 * .05))^2 = (16.45)^2 = 270.60, which rounds up to 271 farms!
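As a rough check on the arithmetic above, here is a minimal Python sketch of the same formula. The function names are my own for illustration, and the z value is taken from Python's standard library rather than a printed table (about 1.6449 instead of the rounded 1.645), so intermediate values differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def required_sample_size(confidence: float, margin: float) -> int:
    """n = (z / (2E))^2, rounded up to a whole farm.

    Assumes the unknown proportion is 0.5 (the assumption built into
    this version of the formula) and a normal approximation.
    """
    z = z_for_confidence(confidence)
    return math.ceil((z / (2 * margin)) ** 2)


def worst_case_margin(confidence: float, n: int) -> float:
    """The same formula solved for E, given a sample of size n."""
    z = z_for_confidence(confidence)
    return z / (2 * math.sqrt(n))


n_needed = required_sample_size(0.90, 0.05)
margin_at_11 = worst_case_margin(0.90, 11)

print(n_needed)                # farms needed for a .05 margin of error
print(round(margin_at_11, 3))  # margin of error the formula implies for 11 farms
```

Turning the formula around this way suggests that, under the same assumptions, a sample of 11 farms corresponds to a margin of error of roughly .25 rather than .05.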

To do this we have to make some assumptions:

Since we don't know the actual proportion of dairy farms that abuse animals, the most objective estimate may be 50%. The formula above is derived from that assumption: the general form is n = z^2 * p(1-p) / E^2, and setting p = .5 reduces it to (z / (2E))^2. Assuming 50% is also the most conservative choice, because p(1-p) is largest at p = .5 and so yields the largest required sample. If we assumed instead that 90% of farms abused their animals, the required sample size would be the same as if we assumed only 10% did, since p(1-p) is identical in both cases: about 98 farms, still far more than 11. The calculation also assumes that the sampling distribution of the proportion is approximately normal. And to calculate anything at all, we would still have to depend on someone's subjective opinion of whether a farm was engaging in abuse or not.
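The effect of the assumed proportion can be sketched in a few lines of Python. This uses the general form of the sample size formula, n = z^2 * p(1-p) / E^2, which reduces to the formula above when p = .5. The function name and default arguments are my own choices for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p: float,
                               confidence: float = 0.90,
                               margin: float = 0.05) -> int:
    """n = z^2 * p * (1 - p) / E^2, rounded up to a whole farm.

    p is the assumed (not observed) proportion of farms with abuse.
    Relies on the normal approximation for a sample proportion.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Compare the three assumptions discussed in the text.
sizes = {p: sample_size_for_proportion(p) for p in (0.1, 0.5, 0.9)}
print(sizes)
```

Because p and (1 - p) enter the formula symmetrically, assuming 10% and assuming 90% give the same answer, and 50% gives the largest one.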

I'm sure the article that I'm referring to above was never intended to be scientific, but the author should have chosen their words more carefully. What they have is allegedly a 'random' observation and nothing more. They have no 'empirical' evidence to infer from their 'random' samples that these abuses are 'agribusiness-as-usual' for the whole population of dairy farmers. While MFA may have evidence sufficient for taking action against these individual dairies, the standard should be set much higher in order to support a larger role for government in animal agriculture, which seems to be the goal of many activist organizations.

Note: The University of Iowa has a great number of statistical calculators for doing these sorts of calculations. The sample size option can be found here. In the box, just select 'CI for one proportion'. Deselect finite population (since the population of dairies is quite large at 75,000), then select your level of confidence and margin of error.
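To see why the finite population option matters so little here, the sketch below applies the standard finite population correction, n = n0 / (1 + (n0 - 1) / N), to the sample size from the formula above. I am assuming this textbook correction for illustration; I have not checked what the calculator does internally, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def finite_population_sample_size(population: int,
                                  confidence: float = 0.90,
                                  margin: float = 0.05,
                                  p: float = 0.5) -> tuple[float, float]:
    """Return (n0, n_adjusted) before rounding.

    n0 ignores the population size; n_adjusted applies the textbook
    finite population correction n0 / (1 + (n0 - 1) / N).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    n_adjusted = n0 / (1 + (n0 - 1) / population)
    return n0, n_adjusted


# 75,000 dairy operations, per the USDA figure cited in the post.
n_infinite, n_finite = finite_population_sample_size(75_000)
print(math.ceil(n_infinite), math.ceil(n_finite))
```

With a population of 75,000 the correction changes the required sample by about one farm, so treating the population as effectively infinite is a reasonable simplification.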

References:

"Profits, Costs, and the Changing Structure of Dairy Farming", ERR-47. Economic Research Service, USDA.

"Governor Paterson, Shut This Dairy Down", Jan 27, 2010. OpEdNews.com

5 comments:

d0wn3r said...

Very deep post you have here. As I have always believed, do not, I repeat, do not rely on statistics. They are just numbers that can change anytime and anywhere. For me, statistics can only give us an approximation, an average condition, not a conclusion. If we want to know about something more deeply, we should conduct qualitative research, not quantitative (statistical) research. Have you ever googled the word mhmhmh? Why?

CrisisMaven said...

Very educating! I see you are interested in statistical research. I have put one of the most comprehensive link lists for hundreds of thousands of statistical sources and indicators on my blog: Statistics Reference List (http://crisismaven.wordpress.com/references/). And what I find most fascinating is how data can be visualised nowadays with the graphical computing power of modern PCs, as in many of the dozens of examples in these Data Visualisation References (http://crisismaven.wordpress.com/references/references-subjects-covered/data-structuring/data-visualisation-references/). If you miss anything that I might be able to find for you or you yourself want to share a resource, please leave a comment.

Matt Bogard said...

Thanks Chris!

Matt Bogard said...

Down3r: I use qualitative approaches a lot. But I still think the utility of quantitative techniques based on statistical theory is that you can determine whether your results are just due to chance or whether they are significant. This goes way beyond just looking at averages. But I agree there is much utility in qualitative approaches for exploratory analysis and in situations where data may not lend itself well to certain statistical techniques. Even in the article I mentioned, I admit qualitative analysis may be more important, but the author tried to make a statistical claim based only on the fact that the observations were random.