Tuesday, September 18, 2007

Of inductive logic, perceptions, and natural languages

He reads this and hates all men. Frivolous as it seems, I am tempted to ask, why not just Pakistani men?

The problem is, of course, central to all of inductive reasoning. How strong is the inductive link in your generalization? The solutions are very context-specific. And the interpretation of context is very experience-specific. One of the main irritants with life is that statistical inference is not readily obtainable. Hence, one's experience becomes one's truth. And we all have different versions of the truth - each one, a priori, as true as the others. Post-modernism suddenly seems attractive.

Given the sentence "Rand is popular in the girls' hostels", how do you interpret it? Does the speaker want to suggest that Rand is more popular than unpopular in the girls' hostels? Or does he want to say that Rand is more popular in the girls' hostels than in the boys' hostels?

Mathematically, let's define a threshold of popularity, say x% readership, and denote Rand's reader base (expressed as a percentage) among hostelite girls as P(h-girls), among hostelite boys as P(h-boys), and among non-hostelite girls as P(nh-girls). What does the speaker want to say by "Rand is popular in the girls' hostels"?

A) P(h-girls) > x : Rand is popular on an absolute scale (the dictator's interpretation)
B) P(h-girls) > P(h-boys) : Rand is more popular among hostelite girls than among hostelite boys
C) P(h-girls) > P(nh-girls) : Rand is more popular among hostelite girls than among non-hostelite girls
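
For the programmatically inclined, the three readings can be sketched as three different truth conditions over the same numbers. This is only an illustration: the class name, the function name, and every percentage below are made up for the example, and none of them come from an actual survey.

```python
from dataclasses import dataclass


@dataclass
class Readership:
    """Rand's reader base, as percentages (all values here are hypothetical)."""
    h_girls: float   # P(h-girls): hostelite girls
    h_boys: float    # P(h-boys): hostelite boys
    nh_girls: float  # P(nh-girls): non-hostelite girls


def interpretations(r: Readership, x: float) -> dict[str, bool]:
    """Evaluate readings A, B and C of the sentence for a threshold of x%."""
    return {
        "A": r.h_girls > x,           # popular on an absolute scale
        "B": r.h_girls > r.h_boys,    # more popular than among hostelite boys
        "C": r.h_girls > r.nh_girls,  # more popular than among non-hostelite girls
    }


# Invented numbers, chosen so that the three readings disagree.
sample = Readership(h_girls=30.0, h_boys=20.0, nh_girls=35.0)
result = interpretations(sample, x=50.0)
print(result)  # {'A': False, 'B': True, 'C': False}
```

With these invented figures the same sentence is true under B and false under A and C, which is exactly why the choice of interpretation matters.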

The above question is not rhetorical - all readers are encouraged to answer A, B or C in the comments, along with reasons if they have any. The results, as we shall see in the next post, will probably offer some insight for Statistical Natural Language Processing, which, incidentally, is roughly the field I worked on in my undergraduate final year project.

p.s. : Of course I was kidding about postmodernism. It's a pile of garbage, worth only our collective contempt and ridicule.


doubtinggaurav said...

I think the interpretation is a matter of semantics.

And testing the veracity of the assertion is a simple matter of taking a survey of the girls' hostel in IIMA.

zen babu said...

Semantics is not universal; it may be a statistical phenomenon. This is one of the central foundations of statistical NLP (as opposed to classical NLP).

I'm of course more interested in knowing the statistical distribution of the interpretations. Once that is established, and I am still interested in this, a survey may just be conducted once the second term starts.

Jai_Choorakkot said...

I'd go with option A too.

If the sentence was:

Rand is MORE popular in the girls hostels,
I'd lean to option B.

If the sentence was:

Rand is MORE popular among hostel girls,
I'd move to option C.

With the original article, the perception of that behaviour being more male than Pakistani may have led to that conclusion.


Jai_Choorakkot said...

PS: on the original post, it is interesting that just prior to that he posts on the Duke University lacrosse case, and there he is merely angry. A very general and unspecified anger.

He doesn't appear to hate:
- all strippers
- all DAs
- even that specific stripper
- even that specific DA.

And the loss to the lacrosse players was considerably more than what that lady went through at the consulate.


doubtinggaurav said...

Lacrosse, Pakistan, Strippers, Anger !!

Where am I, who am I
What is all this that is happening ??

Sriram said...

Ritwik, setting aside how the wording sucks(?), what I intended to mean was somewhat like 'A'. That is not what Gaurav meant I think. Taking the x% which you suggest, I define x as n(girls who like Rand)/n(girls who have read Rand). Gaurav meant the denominator to be n(all girls). This 'x' is significantly higher than what you'd arrive at for men. Like you say, we need to depend on our own experiences in the absence of stats. I have not met any man (not boy, meaning the chronological age) who has good things to say about Rand. On the contrary I know of girls who reread Fountainhead or Atlas Shrugged when they get bored!

Looking forward to your series on logic and fallacies.

zen babu said...


You'd be surprised to know the number of men who absolutely swear by Rand. We call them libertarians. (Actually that's a stupid generalization too, but one that I like immensely)

Anyhow, I'm not planning a series on fallacies - am not a cartel libertarian. The series on logic, of course, is due.

Sriram said...

Ritwik, that's a bit counterintuitive. Men, in my opinion, cannot stomach female sexual liberation. So I guess men hate at least that part of her novels. Again, another generalization. :)

doubtinggaurav said...

Dear Sriram,

Perhaps you are confusing Ayn Rand with Nancy Friday?

Venu said...

If the point of the post was to illustrate the Humean problem of induction, then I am not sure your Rand example was very relevant. The problem of induction is orthogonal to the problem of natural language understanding or interpretation, i.e. the problem persists regardless of the vagaries/ambiguities of natural language. There would still be a problem of induction even if you were dealing purely in first-order logic.

zen babu said...


Yes, definitely the two are orthogonal. I moved from one to the other through the common link of statistical inference based on experience/common sense, with reference to a particular quote that I had come across while discussing something else on somebody else's blog.

The Rand example was just a way to move into interpretation and statistical NLP, which I plan to write about in the near future.