1. STATISTICAL SYLLOGISM applies a statistical generalization about a "reference class", F, to a specific member of the class, m. For example, a statistical syllogism may go from 'Most Fs are Gs' and 'm is an F' to 'm is a G' (positive form), or from 'Few Fs are Gs' and 'm is an F' to 'm is not a G' (negative form). Instead of 'most' or 'few' the statistical premise may use a phrase like 'all or most' or 'few if any'. Often the quantifier is a quantitative one, such as '85% of all Fs are Gs', or 'at most 20% of Fs are Gs'. If the statistical premise is obtained by generalizing from a sample, it may specify a margin of error, as in '80% +/- 5% of all Fs are Gs'.
The most obviously correct examples of statistical syllogism involve items selected at random from the reference class. If all you know about a particular marble is that it was selected at random from a large collection of marbles of which at least 95% were red, then it seems very reasonable to think that the marble is red. If all you know about a particular playing card is that it was dealt at random from a deck in which fewer than 8% of the cards were Aces, it seems reasonable to suppose that the card is not an Ace.
In cases where nothing quite like random selection is involved, a statistical syllogism may still be in order if the individual in question can be regarded as a "typical" member of the reference class. If most German shepherd dogs are aggressive, and there is a typical-looking shepherd coming down the street toward you now, you should probably expect it to be aggressive even though it was not randomly selected from all the German shepherds in existence.
One of the main sources of fallacy in statistical syllogism is an inappropriate choice of reference class. Every individual belongs to indefinitely many different classes, and sometimes statistics on two of the classes to which an individual belongs are on opposite sides of the 50% mark. Most physicians do not smoke, but most people who grew up on tobacco farms do smoke. If we know that Fritz is a physician who grew up on a tobacco farm, it would be fallacious for us to conclude either that he probably doesn't smoke (because he's a physician) or that he probably does (because he grew up on a tobacco farm). Each reference class is inappropriate because Fritz is also known to belong to the other. If we happen to have statistics on the intersection of these two reference classes (that is, on physicians who grew up on tobacco farms), we should use that as the reference class and draw the conclusion (if any) supported by those statistics. If not, we should avoid using statistical syllogism to arrive at a conclusion about whether Fritz smokes. Otherwise we will be committing the fallacy of ignoring relevant evidence.
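The rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `side` and `infer`, the class labels, and the proportions (10% of physicians smoke, 70% of people raised on tobacco farms smoke) are all invented for the example, and treating agreement among single-class statistics as sufficient is a simplification of the reasoning in the text.

```python
def side(proportion):
    """True if most members have the attribute, False if most lack it,
    None if the proportion sits exactly at the 50% mark."""
    if proportion == 0.5:
        return None
    return proportion > 0.5

def infer(memberships, stats):
    """Apply a statistical syllogism using the most specific reference class.

    memberships: set of class names the individual is known to belong to.
    stats: dict mapping a frozenset of class names to the proportion of
           that class (or intersection) having the attribute.
    Returns True or False for a conclusion, or None when none is warranted.
    """
    key = frozenset(memberships)
    if key in stats:
        # Statistics on the intersection exist, so use them.
        return side(stats[key])
    # Otherwise collect the verdicts of the single classes we have data on.
    verdicts = {side(stats[frozenset([c])])
                for c in memberships if frozenset([c]) in stats}
    if len(verdicts) == 1:
        return verdicts.pop()   # every available class points the same way
    return None                 # conflicting or missing statistics

# Hypothetical proportions of smokers (invented for illustration).
stats = {frozenset(["physician"]): 0.10, frozenset(["tobacco_farm"]): 0.70}
print(infer({"physician"}, stats))                  # only one class is known
print(infer({"physician", "tobacco_farm"}, stats))  # Fritz: classes conflict
```

With these made-up figures the second call draws no conclusion about Fritz, while adding an entry for the intersection of the two classes would settle the matter in whichever direction that entry points.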
Note that it doesn't take actual statistics on an alternative reference class to make a given reference class inappropriate. As long as there is reason to suspect that the statistics are quite different for another class to which the individual belongs, it is unsafe to assume that the individual is a "typical" member of the class for which we do have statistics. If a study showed that most cocker spaniels are docile, it would not be safe to conclude that Laddie, an abused cocker spaniel, is docile. The information that Laddie has been abused gives us reason to doubt that he is a behaviorally "typical" cocker spaniel, even if we don't have actual statistics on the docility of abused cocker spaniels. If we ignore this information and make the inference anyway, we will be ignoring relevant evidence.
2. GENERALIZATION goes from statistics on specific members (the "sample") of a class (the "population") to a statistical conclusion about the population as a whole. The premises state the "composition" of the "sample" with respect to a certain attribute, that is, the proportion of sample members that have the attribute, and the conclusion states that the population has the same or about the same composition as the sample. For example, pollsters may go from the premise that 63% of a large, random sample of voters prefer candidate A to the conclusion that between 60% and 66% of all voters prefer candidate A. Quality control experts may go from the premise that 95% of the whatsits they randomly selected and tested were OK to the conclusion that about 95% of the whatsits in the lot from which their sample was taken were OK.
Generalization is the subject of oddly conflicting attitudes. Many people are quite willing to generalize on the basis of a few cases or even a single case, thereby committing the fallacy of small sample or hasty generalization. Others (or perhaps in some cases the same people) are very suspicious of the conclusions statisticians draw from much larger numbers of cases. How, they wonder, can anyone claim to divine the opinions of 100 million people from interviews with a couple thousand? How can anyone declare that candidate A will win when only 5% of the votes have been counted? Although such questions are often intended to be rhetorical, the supposedly obvious answer that "It can't be done!" is false. In cases where the "sample" is randomly selected from the population, the generalization from sample to population can be backed up by a complex argument in which all but one of the steps are deductive, and the one nondeductive step is an intuitively correct statistical syllogism.
The foundation of the reasoning lies in mathematics, which is deductive. The relevant mathematical result, qualitatively stated, is that most of the large samples that could possibly be drawn from a population have compositions close to that of the population. More specific results can be given for particular sample sizes. For instance, it can be shown mathematically that at least 95% of all possible samples of size 1600 (a typical size for professional polls) have compositions within about 2.5% of the composition of the population. It doesn't matter whether the population size is 100 thousand, 100 million, or 100 trillion: the ratio of matching samples to possible samples is essentially constant for populations much larger than the sample. Moreover, the margin of error can be reduced by increasing the sample size. At least 95% of all possible samples of size 6400 have compositions within about 1.25% of the composition of the population. Each quadrupling of the sample size reduces the margin of error by half. (This assumes a "confidence level" of 95%, the usual figure for such studies: if the margin of error stated in the conclusion is held constant, increasing the sample size increases the confidence level.)
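These figures can be checked with the usual normal approximation for a sample proportion. The sketch below assumes simple random sampling from a population much larger than the sample, the worst-case proportion p = 0.5 (which is why the text can say "at least 95%"), and z = 1.96 for a 95% confidence level. The function name `margin_of_error` is an illustrative choice, not something taken from the text.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Approximate margin of error for a sample proportion.

    n: sample size
    z: standard-normal multiplier (1.96 corresponds to about 95% confidence)
    p: assumed population proportion; 0.5 gives the widest margin
    """
    return z * math.sqrt(p * (1 - p) / n)

# Each quadrupling of the sample size halves the margin of error.
for n in (400, 1600, 6400):
    print(f"n = {n:5d}: margin of error is about {100 * margin_of_error(n):.2f}%")
```

Under these assumptions the margins come out near 4.9%, 2.5% and 1.2% for samples of 400, 1600 and 6400. The population size does not appear in the formula at all, which matches the point that it is irrelevant once the population is much larger than the sample.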
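The statistical syllogism underlying a generalization goes something like this (where P is the population and S is the sample):
| Most large random samples of P have compositions close to P's |
| S is a large random sample of P |
| --------------------------------------- |
| S's composition is close to P's |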
The claim that P's composition is close to S's follows deductively from the above conclusion.
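We can determine by observation what S's composition is and reason deductively as follows:
| S's composition is X |
| P's composition is close to S's |
| --------------------------------------- |
| P's composition is close to X |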
Thus we can infer the approximate composition of an arbitrarily large population from the composition of an absolutely large random sample, even if the sample is small relative to the population size, by reasoning in which the only non-deductive step is a statistical syllogism.
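The claim that most large random samples resemble their population can also be checked by simulation rather than by proof. The sketch below is an illustration only: it assumes a population so large that each draw is effectively independent, borrows 63% from the polling example as a hypothetical true proportion, and uses an invented function name, `coverage`.

```python
import random

def coverage(p, n, trials, tolerance, seed=0):
    """Fraction of simulated random samples of size n whose composition
    falls within `tolerance` of the population proportion p."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    close = 0
    for _ in range(trials):
        # Draw n members; each has the attribute with probability p.
        hits = sum(rng.random() < p for _ in range(n))
        if abs(hits / n - p) <= tolerance:
            close += 1
    return close / trials

# How many samples of 1600 land within 2.5 points of the true 63%?
print(coverage(p=0.63, n=1600, trials=500, tolerance=0.025))
```

The printed fraction should come out at or above roughly 0.95, which is the statistical premise of the syllogism: a randomly drawn sample is very probably one of the many samples that match the population.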
In cases where a truly random sample cannot be obtained, we must assume that our method of sampling is not biased. A sample is truly random only when all members of the population have the same chance of being selected for inclusion in the sample. Typically this is not the case. Some members of the population will have a greater than average chance of being selected and others a less than average chance for reasons related to our method of sampling. The composition of the sample can be safely projected onto the entire population only if it is safe to assume that the characteristics that determine an individual's chance of being selected for the sample are unrelated to the attribute we are studying. In a political preference poll, selecting our sample from among homeowners will normally introduce a bias that makes the sample worthless because homeowners are on average wealthier than non-homeowners, and the more wealthy tend to be more conservative than the less wealthy. The use of such a biased sample is a fallacy regardless of how large the sample is, just as the use of a small sample is a fallacy even if the members were randomly chosen.
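A small simulation shows why a biased sample cannot be rescued by making it larger. Every number here is invented for illustration: 60% of voters are assumed to be homeowners, 55% of homeowners and 35% of other voters are assumed to support the candidate, and the names `poll` and `TRUE_SUPPORT` are hypothetical.

```python
import random

# Hypothetical population figures (invented for illustration only).
HOMEOWNER_SHARE = 0.60
SUPPORT_HOMEOWNERS = 0.55
SUPPORT_OTHERS = 0.35
TRUE_SUPPORT = (HOMEOWNER_SHARE * SUPPORT_HOMEOWNERS
                + (1 - HOMEOWNER_SHARE) * SUPPORT_OTHERS)   # about 0.47

def poll(n, homeowners_only, rng):
    """Return the fraction of n sampled voters who support the candidate."""
    supporters = 0
    for _ in range(n):
        # A biased poll reaches only homeowners; a random one reaches anyone.
        is_homeowner = homeowners_only or rng.random() < HOMEOWNER_SHARE
        p = SUPPORT_HOMEOWNERS if is_homeowner else SUPPORT_OTHERS
        supporters += rng.random() < p
    return supporters / n

rng = random.Random(0)
random_small = poll(1600, homeowners_only=False, rng=rng)
biased_large = poll(100_000, homeowners_only=True, rng=rng)
print(f"true support:          {TRUE_SUPPORT:.3f}")
print(f"random sample, n=1600: {random_small:.3f}")
print(f"biased sample, n=100k: {biased_large:.3f}")
```

Under these assumptions the biased poll settles ever more precisely on the homeowners' figure of 55% as it grows, so its error of about 8 points never shrinks, while the much smaller random sample stays within a few points of the true value.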
3. ANALOGY draws a conclusion about one thing from a premise about some other thing together with premises asserting other similarities between the two things. Given that individuals a and b both have characteristics F1, F2, etc., and that b also has G, we might conclude that a also has G. The two key factors in analogy are the extent of the relevant similarity between a and b and the absence of relevant differences between them. There is no point in counting similarities and differences. One similarity or difference may be extremely relevant and many others almost entirely irrelevant. The relevance of a similarity or difference must be judged on the basis of general background knowledge, including especially our well-established theories about things of the kind(s) we are reasoning about, about the kinds of causal processes there are in the world, and so forth. Assessing the strength of a good analogy is usually more difficult than assessing the strength of a generalization or a statistical syllogism, but bad analogies are often conspicuously bad, highlighting obviously irrelevant similarities and/or ignoring relevant differences.
4. SIMPLE INDUCTION goes from premises about some members of a class (a "sample") to a conclusion about some other member of the same class. Thus simple induction goes from premises like those of a generalization argument to a conclusion like that of a statistical syllogism, without involving any statistical statement about the class as a whole. As in the case of analogy, the premises and conclusion of a simple induction are at the same level of generality, in contrast to statistical syllogism, which goes from general to specific, and generalization, which goes from specific to general. The difference between analogy and simple induction is roughly this. In analogy, the strength of the inference to the conclusion that a has G depends on a's having many things in common (being an F1, an F2, etc.) with one other individual that is known to have G. In simple induction it depends on a's having one thing in common (being an F) with many other individuals most of which have G.
COMPARISON 1: Generalization versus Statistical Syllogism. These two types of argument go in opposite directions. Generalization has a conclusion that is more general than any premise, while statistical syllogism has a premise that is more general than the conclusion. Examples:
| Generalization | Statistical Syllogism |
| My cat Sloth likes tuna | Most cats like tuna |
| Al's cat Zubin likes tuna | Sloth is a cat |
| etc. | -------------------- |
| -------------------------- | Sloth likes tuna |
| All or most cats like tuna | |
| Generalization | Statistical Syllogism |
| Most of the Iowans I know like corn | Few dogs lack fleas |
| -------------------------------------- | Spot is a dog |
| Most Iowans like corn | -------------------- |
| | Spot has fleas |
COMPARISON 2: Generalization versus Simple Induction. Both of these types of induction involve inference from the makeup or composition of a sample. That is, what is given in the premises is that certain specific members of a class, say A, have been observed, and that all, or most, or a certain percentage of these observed As are Bs. The conclusion drawn in generalization is that some correspondingly high (or low) proportion of all As are Bs. In other words, a conclusion is drawn, on the basis of the makeup of the sample, about the makeup of the entire class from which the sample was taken. By contrast, in simple induction, a conclusion is drawn about a single case that is in the class but not in the sample, either that it is a B, or that it isn't. Examples:
| Generalization | Simple Induction |
| All of the cats I've known love fish | All of the cats I've known love fish |
| --------------------------------------- | ------------------------------------ |
| All or most cats love fish | This new cat will love fish |
| Generalization | Simple Induction |
| 80% of the students polled dislike pop quizzes | 80% of the students polled dislike pop quizzes |
| --------------------------------------- | Betsy is another student |
| About 80% of all students dislike pop quizzes | --------------------------------------- |
| | Betsy dislikes pop quizzes |
COMPARISON 3: Statistical Syllogism versus Simple Induction. Both of these types draw a conclusion about a single case from premises about a class. The difference lies in the relationship of that individual to the class of cases described in the premise(s). If that class is one to which the individual belongs, the argument is a statistical syllogism. Otherwise, the argument is a simple induction. Examples:
| Statistical Syllogism | Simple Induction |
| Most apples are bland | Most of the apples I've tasted have been bland |
| This is an apple | This is another apple |
| --------------------------------------- | ------------------------------------ |
| This is bland | This is bland |
| Statistical Syllogism | Simple Induction |
| Most of Al's dates have been smart | Most of Al's dates have been smart |
| Al has dated Judi | Al is going to date Tracey |
| --------------------------------------- | --------------------------------------- |
| Judi is smart | Tracey is smart |
COMPARISON 4: Simple Induction versus Analogy. In analogy, as in simple induction, the conclusion is about one item (call it a). But in analogy, the premises are also about a single item (b) that is said to be similar in certain ways to a, and to be (let's say) a G. The conclusion then says that a is also a G. In simple induction, the conclusion that a is a G is based on the fact that numerous other items that, like a, are Fs, are also Gs (or that most of them are). So the difference between simple induction and analogy is that between basing the claim that a is G on many other similar cases (simple induction) and basing it on one very similar case (analogy). Examples:
| Analogy | Simple Induction |
| Jim and Bill have similar interests, talents, | Most of the student athletes I know are liked |
| aptitudes, habits, personalities, etc. | by their classmates |
| Jim is liked by his classmates | Bill is another student athlete |
| --------------------------------------- | ------------------------------------ |
| Bill is liked by his classmates | Bill is liked by his classmates |
| Analogy | Simple Induction |
| Chickadees are closely related to titmice and regularly eat the same kinds of seeds | All birds to whom I've offered peanut butter have eaten it |
| Titmice will eat peanut butter | I will offer peanut butter to chickadees |
| --------------------------------------- | --------------------------------------- |
| Chickadees will eat peanut butter | The chickadees will eat the peanut butter |