On Philosophy

November 30, 2007

Two Kinds Of Claims

Filed under: Epistemology — Peter @ 12:00 am

Yesterday I briefly introduced a distinction between two kinds of claims. On one hand, I said, we have claims about the structure of a model of the world, and on the other we have claims about the world arising from the model. For example, the claim that there are forces is a claim about the Newtonian model, but the claim that things fall with an acceleration of 9.8 m/s² is a claim about the world that might be deduced from the model. Previously I was specifically interested in identity, and asserted that identity claims are about the model and not the world. This solves a number of problems associated with them, because, taken about the world, identity claims seem trivial and uninformative. But, at the level of the model, claims about identity can be genuinely interesting because they affect which objects our model contains. That is simply one application of the distinction, but it illustrates that what we take a claim to be about, in this sense, may affect how we handle it. One of those differences, as I will elaborate, is epistemological: claims on one side of this divide are handled differently from those on the other.

Let’s begin with claims that are essentially about the world, but which follow from our model, meaning that, according to the model, the claim can be the case if and only if some particular fact holds in the model. For example, going back to Newton, something can only be under acceleration if there are forces acting on it, and for any particular acceleration and object there can only be a single vector sum for the forces acting on it. Thus we might justifiably treat the truth of such claims as being essentially tied up with deduction. Either we can deduce them from our model and the state that we think it is in, or we can deduce facts about what state the model must be in from them. (Given that we are working with some particular model.) Another feature of such claims is that they are verifiable, at least in principle. If we believe something to be accelerating then we can check that hypothesis with careful measurement. In fact this is the easiest way to distinguish what counts as claims about the world and about the model: if we can check them with uncertainty arising only from the nature of our measurements then they are claims about the world, but if they are always, even in principle, some distance from being able to be completely confirmed then they are claims about the model.
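To make the two directions of deduction concrete, here is a minimal sketch in Python, assuming the Newtonian model is reduced to F = ma in two dimensions. The function names and the sample numbers are my own illustrative choices, not part of any physics library.

```python
# Hypothetical illustration: deduction within a fixed Newtonian model (F = m * a).
# Vectors are plain (x, y) tuples; units are kilograms, newtons and m/s^2.

def acceleration_from_forces(mass_kg, forces_n):
    """Model state -> claim about the world: deduce the acceleration we should measure."""
    net_x = sum(fx for fx, _ in forces_n)
    net_y = sum(fy for _, fy in forces_n)
    return (net_x / mass_kg, net_y / mass_kg)

def net_force_from_acceleration(mass_kg, acceleration):
    """Claim about the world -> model state: the single vector sum of forces it requires."""
    ax, ay = acceleration
    return (mass_kg * ax, mass_kg * ay)

# A 2 kg object acted on only by a downward force of 19.6 N.
predicted = acceleration_from_forces(2.0, [(0.0, -19.6)])

# Going the other way: a measured acceleration fixes the net force in the model.
required = net_force_from_acceleration(2.0, predicted)
```

Only the measured acceleration can be checked with instruments; the forces remain entries in the model that the measurement constrains.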

(Complicated digression: it follows from this that what may be claims about a model may change into claims about the world depending on our instruments, at least in some sense. I think this is essentially correct; consider, for instance, a model of the world that is atomic. And eventually instruments will be developed that allow people to look at these hypothetical atoms directly. Obviously when they make the observations they will take what they see to be the hypothetical atoms. But what has really happened is that they have discovered a new domain of phenomena and extended the old atomic model in a one to one correspondence to account for it. However, what is called “atoms” has changed from a part of the model to a part of the world simply because what is now called atoms are the things observed with the instruments, not primarily the hypothetical entities of the theory, and if they act in ways that disagree with the entities previously called atoms we will revise the atomic theory in response. Which proves that they are no longer part of a model, because when your model doesn’t correspond to reality you conclude that the entities defined by your model don’t exist, given that they were defined completely by the model, and you introduce new entities, possibly similar to the old ones, with different definitions. But when you have turned claims about atoms into claims about the world it is no longer possible to conclude that the atoms of the theory don’t exist, you can only revise the theory to make them act differently.)

In contrast, claims about the model are never deductive. We don’t deduce which model is correct, rather we infer it by determining which model best matches up to reality. And, obviously, we can never be completely sure that we have found a model that perfectly matches reality because there are always more particular facts about the world to examine which it may very well be inconsistent with. And, as mentioned above, claims about the model are also never directly verifiable; we can never find the entities that exist in our model themselves, we can only find the phenomena that we think correlate with them. None of this makes models, so described, undesirable. Obviously we would prefer perfect certainty, but when dealing with the world perfect certainty is impossible, and working with models in this way is simply the best we can do. Of course certain examples may seem to imply that deduction does play a role in models. For example, from an atomic theory of liquids and Newtonian mechanics we can “deduce” laws about water pressure. Thus we may think that a new model about water pressure has been arrived at by deduction from our previous ones. But this is not what has happened at all, all that we have done is simply apply the models we had already developed, the facts about water pressure were already contained within them in a latent form, just waiting to come out. Of course by extending the model we may arrive at more claims about the world, and thus more ways of testing it, but that doesn’t mean that there is anything new in the model. Furthermore it should also be clear that there is necessarily some core part of the model that cannot be arrived at by deduction and extension in this way, which must be confirmed by tests, and thus which must always be necessarily less than completely certain.

Another difference between claims about the world and the models we use to understand it is in the level of ambiguity that can exist. When it comes to claims about the world ambiguity is, in principle, impossible. Assuming we have made a well-formed claim we could go out and either confirm it or disprove it. Thus, again in principle, it is never an open question whether some claim about the world is true or not. In contrast it is quite possible for there to be systematic ambiguity when it comes to our models of the world. There might be two different models of the world that give rise to the same claims about the world (in fact there will necessarily always be such alternate models). And we cannot, even in principle, decide which of these two models is “right”, because there really is no fact of the matter at all about one of them being right, as they are both equally good models of the world. Still, in such situations we are generally going to prefer one of the models to the other, for reasons that have nothing to do with one of them being right and the other wrong. First we will always prefer the stronger model, the one that makes more claims about the world, both because stronger models are more useful, and because, by making more predictions, they are easier to confirm or refute, and thus we can be more certain of them. If two models are equally strong then we will tend to prefer the simpler and easier to conceptualize one, for the practical reason that they are easier to work with, and such practical reasons factor heavily into why we develop such models in the first place.

Thinking about things in this way has a number of implications, one of which is that we should never attempt to devise models through deduction from other principles, nor should we defend them in that way. A simple glance at science will show you that the best models there are usually justified by their results, and not by any simpler principles that they follow from. Indeed many successful models contain ideas that may seem intuitively absurd, although that doesn’t stop them from being good models. And thus if we were committed to deducing our models from simpler principles we might be moved to reject them, illegitimately, because of the absurdity of the principles that would be required to arrive at them. The problem, as I see it, is that in philosophy part of our task seems to be to come up with such models and use them to explain various parts of the world. Indeed in this very post I am working with a particular model of theories and knowledge. Thus we shouldn’t be looking to prove our philosophical theories, as so many try to do, rather we should be constantly checking them against reality to see if they line up with it. A second consequence of this way of thinking about claims is that it illustrates that not all questions we might direct at the model need answering. Remember the model is not itself justified, and thus we need never answer questions about why the model is the way it is, why a single object models two phenomena that may have seemed distinct, or why a particular law holds. Rather the model has to submit to inquiries about its testability and whether it actually leads to claims about the world. It is only in the context of claims about the world that we can legitimately ask why they are the case, why an object is a particular shade of red, or why two phenomena are correlated. And if our model purports to explain those things it must answer such questions or be revealed as falling short.

November 29, 2007

The Theoretical Nature Of Identity

Filed under: Epistemology,Metaphysics — Peter @ 12:00 am

Ordinary discourse doesn’t include many appeals to the notion of identity, and certainly not any precise ones, even though truths about identity often surface in the form of claims that two apparently distinct things are actually the same. Thus we might turn to logic for a precise definition of identity. There it is treated as just one two place relation among many, one that holds between every object and itself and which never holds between two different objects. Such a description removes all the mystery surrounding identity, but it is also relatively uninformative. For example, it would be nonsensical to ask why two objects are identical, because that would be equivalent to asking why one object is identical to itself. And clearly there is no reason why an object is identical to itself, that is just a brute fact. Problems also arise with the logical definition of identity when considering possibility and necessity, and in other contexts, which suggest that there is more to identity than the logical relation captures, such that logical identity only reflects the nature of identity in certain limited circumstances.

Given that the logical definition of identity has failed us, perhaps we are forced to turn to our ordinary use of identity. But when it comes to our ordinary use of terms such as “the same”, which seem to be those from which the idea of identity is extracted, it doesn’t seem much like a relation at all. Whenever we say that two things are the same there is an implicit understanding that they are the same something. For example, we might say that two people drive the same kind of car, uniting the two distinct objects under the umbrella of a single kind. Now it might be objected that this is simply too broad a conception of identity, that all we really seek to understand is the way in which a brick viewed at one moment is the same as it viewed in another moment. But, even here, we must stipulate that they are the same brick where being the same brick brings with it the idea of a temporally extended object. Otherwise we could object that what we saw in those two moments were really distinct objects, distinct brick-moments. And the same applies when we try to talk about the same brick seen by two different people at a single moment. It could be claimed that there are two non-identical brick-presentations under consideration, and thus to say that they are the same we need to say that they are the same brick, where this time brick brings with it the idea of an observer independent object that can be presented in a number of different ways. Under this conception identity seems to be a concept that is essentially tied to a way of dividing the world into a number of equivalence classes, such that saying two things are the same is really just to say that they belong to the same equivalence class under some division. And thus which things are identical and which aren’t seems completely a product of the way we divide the world, which is itself arbitrary, and thus it is hard to see how claims about identity could be significant, which is the real problem.
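The dependence of “the same” on a chosen division can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Sighting record, its fields, and the three divisions are hypothetical names invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sighting:
    """A hypothetical record of one presentation of something to one observer."""
    label: str     # which temporally extended object we file it under
    kind: str      # e.g. "brick"
    moment: int    # when it was seen
    observer: str  # who saw it

# Three ways of dividing sightings into equivalence classes.
def by_presentation(s):
    return (s.label, s.moment, s.observer)  # finest division: brick-presentations

def by_object(s):
    return s.label                          # temporally extended, observer independent bricks

def by_kind(s):
    return s.kind                           # "the same kind of thing"

def same(a, b, division):
    """Two sightings are 'the same' exactly when the division puts them in one class."""
    return division(a) == division(b)

first = Sighting("brick-1", "brick", moment=1, observer="alice")
later = Sighting("brick-1", "brick", moment=2, observer="bob")
other = Sighting("brick-2", "brick", moment=1, observer="alice")

verdicts = {
    "first/later as presentations": same(first, later, by_presentation),
    "first/later as objects": same(first, later, by_object),
    "first/other as objects": same(first, other, by_object),
    "first/other as kinds": same(first, other, by_kind),
}
```

Nothing in the data settles which division is the right one; the verdict changes only because the function passed in changes.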

That seems odd, because there certainly appear to be important facts about identity, but we can’t easily get back to such facts and something like the logical conception from this starting point. We can’t, for example, simply try to redefine identity without any such equivalence classes. Well, we could, but identity defined without such equivalence classes to bring together things that we might otherwise distinguish between leaves us with virtually no cases where identity holds at all. Under identity, so defined, the only time we might legitimately say that two things are identical is when we are referring to the same particular experience of something and saying that it is identical to itself. This would make identity simultaneously trivial and useless. Nor can we recapture the logical notion by simply picking one way of dividing the world into equivalence classes as “right”. Admittedly such a division would fix things to some extent, but the decision about which division is “right” seems itself completely arbitrary. Some might even make the case that the universe is best conceived of as a single object, and that what we think of as different objects, such as different particles at different moments of time, are really just properties of this universe-object, the property of it containing a particular particle in a particular place at a particular time. Certainly we can’t say that this way of logically describing the universe disagrees with our experience, it just makes talking about it a little difficult, and it completely does away with the possibility of any identity relation, since there is only one object that it might apply to, and the relation is thus rendered meaningless simply because it doesn’t distinguish between objects.

A third way to bridge this gap, which we haven’t considered previously, is to consider identity as tied up with language, something that hasn’t come into play yet. The interesting claims involving identity, we might note, are those where we say that two things designated by different names are identical. For example, we might assert that the richest man in Europe is the tallest man in Europe, which could be to claim that those individuals are identical. And thus it might be supposed that identity is really a claim about what two terms refer to, namely the same thing, which couldn’t be properly captured by the suggestions above simply because they had no way to talk about referring. There does seem to be a kernel of truth in this suggestion, but it too has its failings. First of all it is still based on a kind of ordinary understanding about identity, and so still seems vulnerable to what it means for two things to be “the same thing”, which plays a key role in the definition. Again it seems that a division of the world into equivalence classes is necessarily involved. And in this case the division seems even more complicated. Certainly, for example, the richest man in Europe hasn’t been the richest for exactly as long as he has been the tallest, thus we run into problems in saying what exactly “the tallest person in Europe” means. Does it mean the time extended person who is the tallest now, does it refer to the person-moment that is tallest now, or does it refer to all person-moments in all times that are tallest, regardless of whether they belong to the same time extended person? To make it true and meaningful we must pick some division, but, again, how to divide the world seems arbitrary. Secondly, it seems to fail to capture some of the ineffable essence of claims about identity. When we claim that the morning star is identical to the evening star we mean to say something to the effect that, roughly, the two objects under consideration are really one object. But so far none of the possibilities entertained have let us make this kind of assertion, because they don’t allow us to talk about two objects in one breath and a single object in the next.

To solve some of these problems I think we need to turn to why we talk about objects in the first place. The idea of objects, I claim, is a device that exists simply to make conceptualizing and theorizing about the world simpler. We divide the world up into objects and assign them properties, and then on the basis of this division and its laws we make predictions and check how well our model matches up to observations. Thus objects are from the beginning our invention, they don’t exist outside of us to find, even though talk about objects can be considered in the domain of objective fact. Thus we can make a distinction between talk about the model and talk about relationships between the model and the world as observed. To say that an object is red or at a particular location is to make a claim about the world, expressed through a relationship between statements about our model and the world. On the other hand, to say that a certain law holds of objects is strictly talk about the model. Of course we can make observations that contradict the proposed law, but we can never observe the law itself. And thus it is more accurate to say that our observations have revealed that the proposed model, including the law, doesn’t accurately reflect the world, rather than saying that they show the law doesn’t exist or is false. We can put this distinction to a number of purposes. For example we might point out that world-model correspondences can be given explanations in terms of the model, we can say why an object has a particular color or location by appealing to the structure of the model and its laws; but we can’t explain brute facts about the model itself, such as its laws, in the same way. But obviously this is tangential to the matter at hand.

So, to return to identity, allow me to simply assert that identity is a claim about the model, not about the world. It is saying that our model contains only one object that will be used to explain what might be thought of as two distinct phenomena, or what were explained using two objects in another model. Because identity fixes the model this explains why we have trouble talking about identity from within the model; once you have decided what your objects are it is fairly useless to go over those facts again. Of course even under this understanding of identity there is still some arbitrariness about what your objects are. For example, we could double the number of objects by modeling the world with two objects wherever previously we had one, with one object accounting for all observations made from galactic north and another for the observations from galactic south. However I would point out that there is nothing “wrong” with such models, they are just a more complicated way of expressing essentially the same facts. But, for the sake of convenience, we tend to go for the fewest number of objects, and thus the maximum amount of identity, arguing that we should use a single object whenever possible, so long as contradictory properties (A and ~A) aren’t assigned to that object. Most important, of course, is simply that, regardless of whether there is arbitrariness here, claims about identity are significant, not as facts about the world, but as facts about the model, which makes arbitrariness somewhat irrelevant. I would elaborate further, but I fear I have gone on too long for one day already, so I will leave the rest to the reader (unless some especially interesting complications occur to me later).

November 16, 2007

Bootstrapping Our Way To Knowledge

Filed under: Epistemology — Peter @ 12:00 am

It would seem that part of the justification for any claim contains, in some way, a theory about how knowledge and justification work. This might not seem completely obvious, but you can always uncover such claims simply by asking “why?” enough times concerning the justification for the claims you encounter. For example, suppose that we know it is snowing outside (or at least think we know that). But how do we know it? Well, we have the visual experience of snow falling, and we know that when we see snow falling usually it is the case that snow is actually falling. But how do we know that this correlation is itself a reliable one? We might say that we have observed the correlation to hold in a large number of cases. But why does that entail the correlation probably holds now? At a certain point we are just going to have to assert that this is the way that knowledge works, that some principle of generalization (and possibly other principles as well) justifies claims in a kind of primitive way, such that no justification can be given for the principle itself.

This might seem like a problem, and it would be a problem if we just let the matter rest there. Who in their right mind would let all our intellectual accomplishments rest on a foundation that was essentially a matter of fiat? Obviously we can’t simply find some other principles to justify them from, because then the same questions will arise regarding those principles, and so on. This vicious circle is reminiscent of the problem of induction, which presents us with a similar quandary, namely that to justify induction we seem to require induction, and here it would appear that to justify a theory of knowledge we require a theory about knowledge. Unfortunately this problem isn’t equivalent to the problem of induction, and the solutions to that problem, which usually involve simply relaxing our standards regarding how induction is to be justified, won’t work here.

The solution, if there is a solution, must involve bootstrapping, a process by which we move from knowing nothing about knowledge (what counts as justification for a claim, and so on) to knowing some things about knowledge. And obviously that is only possible if we have somewhat relaxed standards regarding knowledge, meaning that we aren’t required to deduce facts from some absolutely certain foundation in order for them to be knowledge, because there is no absolute foundation to proceed from in this situation. (Of course if you hold knowledge to such strict standards you probably haven’t been able to resolve the problem of induction, in which case that would be a much more pressing problem than this more fundamental question, and so I assume that any reader of this piece is willing to grant me this if they are with me so far.) I propose that, instead of trying to devise a way to conduct such a bootstrapping, we simply look at how people have actually come to have some idea about what knowledge is in a very general way, with the idea that if we haven’t successfully bootstrapped ourselves into a theory about knowledge that is correct in at least some significant ways then this entire enterprise is doomed, because to come to a conclusion about how such bootstrapping works we must lean on certain knowledge. This is not to say that those pieces of knowledge are part of the actual bootstrapping process, but they are a part of how we reach conclusions about it. If we really didn’t have any knowledge we would have to first bootstrap ourselves into some knowledge about how knowledge works before we could come to those conclusions (thus the bootstrapping necessarily precedes any knowledge about it).

One thing people generally rely on in a state of complete ignorance about knowledge is instinct, as we are hard-wired to draw certain conclusions from evidence. And indeed before any theory about knowledge was developed people were probably proceeding on these instincts to distinguish between trustworthy claims and those that could be discarded. The problem with instinct though is that it is essentially a black box as far as we are concerned (it is only much later in our intellectual development, with a number of complicated theories under our belt, that we can return to instinct and explain it and justify it by arguing that it has survival value). It can’t be the foundation for a theory about knowledge because there is no justification for the instincts themselves. The only option left then for the people in this situation is trial and error. Suppose that we begin with the hypothesis that some beliefs are better to have than others, and that we will call the better beliefs knowledge. Obviously this is easily testable by trial and error, because if you actually try to treat all beliefs as of equal value then you will find yourself running into a lot of doors, which clearly shows that treating all beliefs as equal is a bad idea. And naturally a number of specific beliefs could also be tested in this way, but this is not really what we are after, we are after a theory about what differentiates these good and bad beliefs, and ways to arrive at the good beliefs and to avoid the bad beliefs. Again people in this situation might proceed by trial and error, considering a number of different strategies for belief formation and seeing which ones form more good beliefs than bad beliefs (over a number of generations probably, passing down what they have learned from one to the next).

Obviously for this to work the beliefs produced must themselves be testable, or, for obvious reasons, it becomes impossible to make judgments about which ways of forming beliefs are superior. And this might appear to be a problematic move in its own right, because doesn’t it presume that we can know whether the specific beliefs produced are good or bad? Isn’t it possible, in our position of complete ignorance, that our perceptions are so wildly mistaken that we can’t make accurate judgments about which are good and bad? Indeed that is a live possibility if we take these claims as really about the external world in some way. But there is no need to do this, instead we can take them as elliptical statements that are really just about which perceptions we will have. This eliminates any possibility of error at this stage in the game when it comes to judging which beliefs are good and which are bad. Of course, later in the process of bootstrapping, the claim that an external world exists and that the beliefs under consideration are really about it will emerge, either as a claim on its own that is considered knowledge or as part of our theory about knowledge which is justified simply because it is a very successful way of arriving at good beliefs (the latter more accurately reflects how we actually seem to treat the claim).

Obviously, as with solutions to the problem of induction, theories about knowledge itself arrived at by this bootstrapping process can never be completely certain. But that seems to be the fate of knowledge in general, while we can justify claims we are forever unable to achieve perfect certainty. On the other hand, the only way that theories about knowledge itself developed in this way could turn out to be substantially faulty, at least in the domain of these testable beliefs, would be if certain kinds of generalities didn’t exist, if there were no patterns about which processes lead to more good beliefs than bad beliefs, or if which process will be successful is in constant flux. But if such possibilities were indeed the case not only would bootstrapping fail to get anywhere (and thus the fact that such bootstrapping has already occurred is evidence that they aren’t), but it would be impossible to have any theory about knowledge by any means. And so bootstrapping, while not perfect, is as good as it gets.

The question then arises: how can we extend a theory about knowledge developed in this way, since this is the only way to develop a theory of knowledge from scratch, to non-testable domains? And there we have a serious problem, because if we are dealing with a non-testable domain it is hard to even say what knowledge consists in. To even get bootstrapping started we had to begin with the idea of a good belief in order to draw a distinction between two classes, even though that way of drawing the distinction is itself later left behind (replaced with the idea of an accurate belief, at about the same time that the idea of an independently existing external world appears). But if a domain is really non-testable how can we divide it in this way? Perhaps some definition of accuracy could immediately play the required role, but that would seem to require the ability to establish correspondences between claims and features of that domain, requiring us to have access to it, which in turn would imply that claims about it are indeed testable. But, assuming these problems can be solved, it remains a mystery why we should even care about these domains, because clearly if we can’t make testable claims about them then they can’t affect us in any way (otherwise we would be able to make claims about their effects on us, test them and go from there, which is essentially how we proceed with the ordinary world, starting with perception).

Even if we were to simply ignore those problems it would seem that the best we can do is simply to extend the theories about knowledge developed by bootstrapping to cover that domain, which raises yet another difficulty since such theories are going to endorse some kind of generalization from particulars, which requires some kind of access to individual facts to proceed, and which we clearly don’t have when it comes to these domains. All of these problems, taken together, strongly imply that when it comes to things we can’t perceive in some way we can’t have knowledge about them. Which may seem to be contradicting the obvious, since many will claim that we have mathematical knowledge, and it seems quite clear that if some mathematical domain exists that we don’t have direct access to it such that we can test mathematical claims apart from their proof and axioms. But I must question the claim that mathematical theorems are knowledge. What justifies the assertion that a theorem produced by deduction from axioms is correct? Certainly we may endorse the process of deduction, but the axioms themselves are without justification. A better defense of mathematics is to point out that mathematical theorems are often adopted by science for its uses, producing claims that we do endorse as knowledge, and thus that in some way the theorems so adopted must be “true”. The problem here is that for every theorem so adopted there are a number of others, proved from slightly different axioms that are not used, and must therefore be “false”. Consider non-Euclidean geometry, for example. Since a certain non-Euclidean geometry is adopted by physics we might say that it was right. But then all the other variant non-Euclidean geometries were wrong because they said false things about points and lines. Taken at face value this would mean that mathematics is largely a failure at producing knowledge because it produces many more false theories than true ones. Of course that conclusion could be avoided by saying that what lines and points are is simply defined by the axioms of each geometry (a reasonable claim), but then the theorems of mathematics are no longer knowledge about some domain. Rather they are simply a kind of game played with symbols or in complete abstraction from any content, which happens to be useful on occasion; but, by itself, the game is simply a game. (Which is not to put down mathematics, the occasional times when it is useful make it a worthwhile game to play, but it does mean that the game by itself isn’t telling us about anything except the game itself.)

My real worry is, as always, with philosophical knowledge (especially metaphysics). Is it really all testable in some extremely oblique way? Or is philosophy, or some subset of it, simply a game devoid of content, like mathematics, but which, unlike mathematics, never proves useful to anyone but philosophers?

November 15, 2007

Considering “All Crows Are Black”

Filed under: Epistemology — Peter @ 12:00 am

Yesterday I mentioned that deciding what counts as evidence for justifying the statement “all crows are black” is something of a classic problem case. It is a problem that I often have in mind when theorizing about what justification from evidence consists in. However, upon looking back at my work, I realize that I have never actually provided the solution I have developed to the problem, even though all the apparatus needed to solve it has already been laid out. So today I am going to rectify that, by saying what does and doesn’t justify the conclusion that all crows are black, and why.

But first I suppose I should say what exactly the problem is for those of you just tuning in. I think the best way to approach the matter is to consider how we logically formalize that statement. The usual way is to write ∀x(Cx → Bx), which asserts that for all objects if that object is a crow then it is black. But that statement is true if and only if ∀x(~Bx → ~Cx), or, in natural language, that all non-black things are not crows. Now let us suppose that some evidence, X, confirms ∀x(~Bx → ~Cx), say observing a number of colored items, none of which happens to be a crow. Furthermore, if evidence confirms a statement it must also confirm all statements that follow logically from it (simply because that is how deduction works, and if deduction were impossible then it might seem that we would be unable to conclude anything). Thus this evidence must also confirm the claim that ∀x(Cx → Bx), that all crows are black. But that seems absurd: since we haven’t even observed any crows, how can we say that the evidence supports the claim that they are all black? And, even worse, that same evidence, by similar equivalences, supports the claim that all crows are green, that all crows are blue, and so on. Clearly we have done something wrong, although it is not clear at this point exactly where the problem lies.
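To make the logical point concrete, here is a small sketch in Python. The representation of objects as dictionaries, the three-color palette, and the function names are all my own assumptions for illustration. It checks by brute force, over every small collection of objects, that the two formalizations always agree, and then shows that a pile of colored non-crows satisfies “all crows are black” and “all crows are green” alike.

```python
from itertools import product

COLORS = ("black", "green", "red")  # hypothetical palette for the sketch

def all_crows_are(color, world):
    """∀x(Cx → Bx): every crow in `world` has `color` (vacuously true with no crows)."""
    return all(x["color"] == color for x in world if x["crow"])

def all_non_colored_are_non_crows(color, world):
    """∀x(~Bx → ~Cx): nothing lacking `color` is a crow."""
    return all(not x["crow"] for x in world if x["color"] != color)

# Every kind of object the toy worlds may contain: crow or not, in each color.
KINDS = [{"crow": c, "color": col} for c in (True, False) for col in COLORS]

def equivalent_everywhere(max_size=3):
    """Check that the two formalizations agree on every world of up to `max_size` objects."""
    for size in range(max_size + 1):
        for world in product(KINDS, repeat=size):
            for color in COLORS:
                if all_crows_are(color, world) != all_non_colored_are_non_crows(color, world):
                    return False
    return True

# Evidence consisting only of colored things that are not crows.
evidence = [{"crow": False, "color": "red"}, {"crow": False, "color": "green"}]

print(equivalent_everywhere())             # the two statements never come apart
print(all_crows_are("black", evidence))    # satisfied without a single crow
print(all_crows_are("green", evidence))    # and so is the rival claim
```

The check only restates the equivalence; it says nothing yet about which step of the argument should be given up.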

Of course one way out of this problem is simply to deny that the reversed version, that all non-black things are not crows, is in fact confirmed by the evidence that we supposed it was. This is an intriguing possibility because it heads off the problem from the start, and validates our usual ways of thinking about formalizing such general statements and the interaction between evidence and deduction. Unfortunately, I just can’t see how to make the claim stand up to any scrutiny. Why can’t a number of colored objects confirm the claim that all non-black objects fail to be crows? Any plausible answer would have to involve some claim that the negative classes, those things that aren’t black or aren’t crows, are in some way illegitimate, that we can’t even meaningfully generalize about them. And that is not a path completely untrodden, because such negative classes can cause difficulties in other situations. (For example, consider the union of the set of all crows and the set of all not-crows. The result is the set of all things. But the set of all things is not a set. Since the union of two sets always produces a set, either the set of all crows or the set of all not-crows must not really be a set at all. And if one of them has to fail to be a set it seems natural to say that the set of not-crows is the illegitimate one. Furthermore, if we associate properties with sets, as some do, then there must be no such thing as the property of not being a crow, which validates this way of thinking to some extent, but now we have no idea what to do with such things (what do they even mean?).) However, that response opens up a can of worms involving how to deal with negative statements in general, if there is something wrong with them, and I would rather not go down that road unless we absolutely have to.

Another solution to the difficulty, specifically the fact that all crows being black and all crows being blue seem to be entailed by the same evidence, is based upon the realization that, in a sense, the claim that all crows are black is compatible with the claim that all crows are blue (and that nothing is both blue and black) when there are no crows. Given that, perhaps the statement “all crows are black” is better captured by a joint assertion that there are crows and that all of them are black. When the logical character of the statement is captured in this way then it is indeed the case that “all crows are black” necessarily excludes “all crows are blue”. Formally we would now write the assertion as ∃xCx ∧ ∀x(Cx → Bx). Such a formalization also prevents us from verifying the statement with evidence that doesn’t include at least some crows. And if those were the only problems we were concerned with then we might very well be happy with this solution. However, this new formalization does not really solve the problems; rather, they have simply been hidden away. Consider, for example, trying to determine some measure of the strength with which the evidence supports the statement. For convenience we can think of support as coming in a range from 0 to 1. Since our claim consists of a logical conjunction it is natural to assume that the support of the entire claim by some evidence is equal to the product of the support the evidence provides for both terms in the conjunction. Now if our evidence contains even one crow then the support that the first term has goes to 1, or at least very close to it. Given that, all we need to worry about is how well the second term is supported by the evidence. But the second term is just ∀x(Cx → Bx), the statement that we found so problematic. 
And we can reason about support in the same way we did about whether evidence confirmed the statement, namely that if evidence lends support to ∀x(Cx → Bx) it must also support ∀x(~Bx → ~Cx) to the same degree, and vice versa. And now we have a new problem, quite similar to our original, namely that a single black crow plus a large number of other variously colored objects seems to strongly support the claim, and that this support only gets stronger as we add more colored objects. But that seems quite irrational. What should be the case, it seems, is that the statement “all crows are black” receives additional support only when we come across another black crow, and that determining how well supported the claim is should be indifferent to how many other colored objects we have observed.

Which is why I said that the solution we have just been considering was a solution in name only, and that it really just hid the problem, because the problem really was in determining which kinds of evidence support general statements in a principled way. I propose that what has really gone wrong here is turning the claim that all crows are black into the statement ∀x(Cx → Bx) or any logical construction involving that one. What we are really asserting when we make the original statement is that the frequency of crows being black (versus being some other color) is 100%. Thus this claim is simply one instance of a family of claims involving such members as “50% of crows are black” and “99.9% of crows are black”. There is no easy way, to the best of my knowledge, to transform these assertions into first order logic, so we are free to make up our own notation. I propose %(a)(b,c), where a is the frequency, ranging from 0 to 1, b identifies the objects that the frequency holds over, and c is the property under consideration (b and c then might be taken to be formulas with one free variable). And there is the additional restriction that whatever we put in for b must not be able to “double count” objects; not only must it pick out the objects to be counted, it must be such that if a particular physical object is picked out by b then it is impossible for b to pick out as a separate object something that overlaps with the first. For example, if we allow Cx to be the property of being a crow it must not be the case that a complete bird can be C and that a subset of the matter that composes that bird, say the entire bird minus a single feather, can also be C. 
We must forbid such double counting because it makes talk of frequencies meaningless, we have to generalize over a number of discrete and distinct objects or the frequency taken to be observed given some evidence is held hostage to an indeterminacy regarding how many objects there are and how many of them have the property under consideration. Of course this doesn’t have any bearing on the question at hand, but is a problem that has a tendency to crop up in other contexts when using this construction, and I like to be thorough.

In these terms the assertion that all crows are black is %(1)(Cx,Bx), assuming Cx satisfies the other requirements just discussed. And the assertion that all non-black things are not crows is simply %(1)(~Bx,~Cx). Determining how well these kinds of statements are supported by evidence is something I have discussed previously, and so there is no need to delve into any substantial discussion on that topic here. It suffices to point out that, given how evidence lends support to these claims, there is no necessary connection between how likely we deem %(1)(Cx,Bx) and %(1)(~Bx,~Cx), except when the evidence completely refutes %(1)(Cx,Bx). But that is not a problem, because the only way that can occur is if we find some object that is a crow and not black. And that evidence also refutes %(1)(~Bx,~Cx), and so this relationship follows immediately from the evidence, and we don’t have to worry that it is an illegitimate instance of the very kind of logical connection that we were trying to avoid. This formalization also has a number of other advantages, but to conclude I’ll mention just one of them, namely that lacking any crows all statements of the form %(y)(Cx,Bx) are equally likely. Which means that, without any crows, we can’t say anything about how many of them are black; the evidence we collect doesn’t point to any conclusion. And this is exactly how it should be.
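As a rough illustration of how the %(a)(b,c) notation behaves, here is a sketch in Python. It is not the account of support I have given elsewhere. It simply assumes, for the sake of the example, that each object picked out by b independently has property c with probability a, and scores a claim by how probable that makes the evidence. The function name `likelihood` and the dictionary representation of objects are hypothetical.

```python
def likelihood(a, evidence, b, c):
    """Probability of the evidence under %(a)(b, c), assuming (for this sketch only)
    that each object satisfying b independently has property c with probability a.
    Objects that do not satisfy b are ignored entirely."""
    counted = [x for x in evidence if b(x)]
    hits = sum(1 for x in counted if c(x))
    misses = len(counted) - hits
    return (a ** hits) * ((1 - a) ** misses)

is_crow = lambda x: x["kind"] == "crow"
is_black = lambda x: x["color"] == "black"
not_black = lambda x: not is_black(x)
not_crow = lambda x: not is_crow(x)

no_crows = [{"kind": "shoe", "color": "red"}, {"kind": "leaf", "color": "green"}]
two_black_crows = [{"kind": "crow", "color": "black"}] * 2
white_crow = [{"kind": "crow", "color": "white"}]

# Without crows every frequency claim about crows fits equally well.
print(likelihood(1.0, no_crows, is_crow, is_black))   # 1.0
print(likelihood(0.5, no_crows, is_crow, is_black))   # 1.0

# Black crows separate the claims; extra non-crows change nothing.
print(likelihood(1.0, two_black_crows, is_crow, is_black))             # 1.0
print(likelihood(0.5, two_black_crows + no_crows, is_crow, is_black))  # 0.25

# A single non-black crow refutes both %(1)(Cx,Bx) and %(1)(~Bx,~Cx).
print(likelihood(1.0, white_crow, is_crow, is_black))    # 0.0
print(likelihood(1.0, white_crow, not_black, not_crow))  # 0.0
```

Under that assumption the red shoe and the green leaf never enter the calculation for claims about crows, which is the indifference to other colored objects asked for above.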

November 14, 2007

The Burden Of Proof

Filed under: Epistemology — Peter @ 12:00 am

The burden of proof is a commonly invoked principle, used to end an argument by pointing out that there is an unsatisfied burden of proof that falls on one of the parties, and thus that their position can be justifiably considered false unless they produce the required proof (actually, the required evidence; “proof”, while colloquial, is too strong a term). Indeed we might even make the case that all arguments ultimately rest on some claim involving a burden of proof. Even the validity of a claim backed by a mathematical proof rests on the assumption that we aren’t all systematically making the same logical error when considering that proof, and we must “refute” claims that we are making such a mistake by arguing that the person making them bears the burden of demonstrating it; people holding the proof to be accurate are not required to demonstrate that we are not making any such systematic mistake. But, despite its widespread use, there are many confused ideas about who bears the burden of proof and why. Partly, I suspect, this is because the notion is usually invoked only rhetorically; investigations need only produce the best explanation, they are not required to defend that explanation against other possibilities (whether we even should debate things or whether investigations should stand by themselves, to be compared to other investigations but not pitted against them, is something that I will leave aside for the moment, although I lean towards the latter). Because of its informal use people tend to assume that the burden of proof rests with the more outrageous claim, that which we deem the less probable or less rational. But clearly that can’t actually work. 
Not only is there a problem in establishing what these standards are, but once we have established them there is no need for further proofs; given any two such claims, the ability to pass such judgment, taking into consideration the available evidence, clearly warrants us in believing the less outrageous, the more probable and more rational claim by those standards, and there is no need to further bother ourselves with the burden of proof. So, rather than being dependent on such standards, the burden of proof is something we use to decide which alternative is more rational or more probable.

The natural next move is simply to assert that the burden of proof always lies with existence claims, such that the claim that something exists always bears a burden of proof in comparison with the equivalent non-existence claim. I would agree with this principle, in the vast majority of cases, but as presented it isn’t obvious why we should believe it. Furthermore, there are a number of other cases where it would seem that we need to invoke the burden of proof, but in which both alternatives involve existence claims. For example, we would say that the claim that the force of gravity is conveyed by gravity gnomes bears the burden of proof in comparison to the claim that the force of gravity is conveyed by gravitons, even if the gravity gnomes and the gravitons are claimed to have exactly the same observed effects. Now, we could try to cast the difference between these two claims in terms of existence, arguing that the first postulates the existence of gnome-like properties, while the second doesn’t. But, not only does that make the metaphysically questionable move of attributing existence to properties, but it opens up the problem of wondering whether gravitons are postulated to have not-gnome-like properties, thus putting the two on an even footing. Worrying about such negative properties is exactly why it is a bad idea to ascribe existence to properties, and I would rather avoid arguing that the claim something lacks properties does not imply that it has the property of lacking those properties. It is better then to simply look for a better understanding of the burden of proof than worry about such issues.

Let us return then to more fundamental questions before trying to say exactly where the burden of proof lies, and instead consider why the burden of proof even exists in the first place. Simply considering common cases where the burden of proof is invoked reveals that it is usually used to distinguish between a well-confirmed hypothesis and an “implausible”, to speak colloquially, alternative that simply hasn’t been disproved. Note that it is not used to distinguish between two alternative hypotheses that make exactly the same predictions, and which are thus indistinguishable even in principle; those cases we can deal with using a principle of epistemic indifference, which states that if two claims make exactly the same predictions then they have the same content, and that any apparent differences are simply a linguistic confusion arising from trying to say “more than we are able”. Consider, for example, the unicorn hypothesis as compared with the no unicorn hypothesis and the god hypothesis compared to the no god hypothesis, U vs. ~U, G vs. ~G. In their natural, original forms both sides make fairly strong claims and are relatively easy to verify. If we run into a unicorn or god then U or G is confirmed, but if we consistently fail to then ~U and ~G are confirmed instead. Thus in their original forms the U and G hypotheses are relatively quickly “disproved” and we may claim that we know ~U and ~G (even though the possibility of making an observation that confirms U or G can never be completely ruled out). However, those who firmly hold U or G to be true, for whatever reasons, will wish to revise their hypotheses so that they aren’t disconfirmed by the lack of such observations. Thus they may propose new, modified versions, the rare unicorn and rare god hypotheses (RU and RG), which claim that unicorns and god are around, but that observing them is extremely unlikely, as unlikely as needed in order to account for our continual failure to make confirming observations. 
And it may be claimed that RU and RG fit the evidence just as well as ~U and ~G do. It is here that the burden of proof comes into play. We say that, despite the fact that they both fit the evidence equally well, RU bears a burden of proof that ~U doesn’t, and similarly with RG and ~G.

And this is why we need the burden of proof: such transformations can be made with any claim. We can always qualify our assertions in such a way that confirming them becomes arbitrarily hard to do, so that they always fit with the observed facts. For example, we could claim that the laws of physics are wildly different from what they are normally supposed to be, but that this difference only manifests itself when everything present is green. And since it is virtually impossible to create this set-up, where even the instruments making the observations are completely green, this hypothesis can’t be ruled out in a strict sense. But to admit that we simply can’t rule such possibilities out and that we must treat them as equally likely, given the lack of evidence distinguishing the two, as some have a tendency to claim, is simply absurd. This is not like the cases of epistemic indifference: accepting one of the two possibilities may have serious consequences for our understanding of the world. For example, we might postulate that all boxes that are never opened and never inspected contain dead cats, which would have radical consequences for our other beliefs, because we would have to postulate new theories about how the dead cats got there, and so on.

With that in mind I can cut to the chase, and point out that, while the available evidence may fit two available hypotheses equally well, that doesn’t mean that it confirms both of them, and that thinking it does is a linguistic/logical confusion. The “burden of proof” is a device for pointing this out, then: for pointing out that while a particular hypothesis is consistent with the available evidence it isn’t supported by it, while the other is. This goes back to the problem of confirming “all crows are black”. As has been pointed out, this statement is logically equivalent to “all non-black objects are not-crows”. And thus it would seem that since observing objects of other colors which fail to be crows confirms the second it should confirm the first as well. Maybe that doesn’t seem absurd to you. But consider that such evidence would still confirm the hypothesis that all crows were black even if no crows had ever been observed. And in that situation it would equally support the hypothesis that “all crows are blue” (and that all grues are green, for that matter). But clearly that is absurd; the same evidence can’t be legitimately taken to entail two contradictory hypotheses (at least, not if we want to retain the idea that the evidence lends support to the claim). I won’t get into how we are to avoid this dilemma; there are a number of solutions, some of which involve breaking the apparent symmetry by revising the original claim so that it contains an existence clause. It suffices to point out that to resolve the dilemma we end up with the solution that while observing a number of non-black, non-crow objects is consistent with the claim that crows are black, it doesn’t justify that conclusion. 
And so we can legitimately use that distinction when we say that some claim bearing the burden of proof means that while that claim is not contradicted by the available evidence neither is it supported by it, while the alternative is, even if we are unwilling to say exactly what being supported by the available evidence consists in.

And we can see how this understanding of the burden of proof justifies the way it is usually employed. For example, the hypothesis that rare unicorns exist bears the burden of proof over the hypothesis that unicorns don’t exist, because while the lack of observed unicorns doesn’t contradict that hypothesis neither is the claim that rare unicorns do exist supported by it. To be actually supported would require an observation of these rare unicorns. Similarly, the hypothesis that gravity gnomes carry gravitational force carries the burden of proof over the hypothesis that gravitons carry the gravitational force, because the hypothesis involving gravity gnomes involves some additional claims regarding their gnome-like properties while that with gravitons does not (and I would gently remind the reader that the claim that gravitons lack gnome-like properties is a claim in name only, because it asserts nothing about gravitons over and above their role in conveying the gravitational force; to understand it as a claim is to fall prey to the very linguistic/logical confusion we are trying to avoid). Finally, to use a new example, if we come across 100 boxes and we open 99 of them, only to find that they are empty, the hypothesis that the last one contains a ball bears the burden of proof over the claim that it is empty. The sequence of empty boxes justifies the hypothesis that they are all empty, which justifies the claim that the last box is empty; it does not justify the claim that they are all empty except the last one, even though it does not contradict it.
