On Philosophy

November 10, 2007

Convergent Methods

Filed under: Epistemology,Logic — Peter @ 12:00 am

For obvious reasons I am concerned with the method of philosophy: I desire to arrive at some acceptable method for philosophy that we can recognize as being well-suited to producing philosophical theories. The reason this matters, and the reason this isn’t a trivial quest, is the acceptable part. Of course everyone producing philosophical theories is using some method; even constructing a theory by randomly rearranging letters until they make words is a method, just not a good one. Given that, there are two reasons this seems pressing to me. One is that there doesn’t seem to be any consensus about what the method of philosophy is, which is rather absurd. It means, effectively, that two philosophers could disagree philosophically because of differences in their methods and simply have no way to resolve their disagreement, even in principle. Accepting this situation is tantamount to accepting an anything-goes attitude towards philosophy, because any imaginable theory can be produced by some method. So if we simply accept differences in method without criticism we are essentially accepting every theory without criticism. The other reason this seems pressing is that many of the commonly used methods, such as conceptual analysis, seem obviously absurd (obvious in the sense that there is a disconnect between the method and the kinds of philosophical theories we would like to arrive at, given that conceptual analysis can only legitimately produce claims about our concepts and never about anything with an objective existence), and so repairs seem to be in order.

In many ways, however, it is easier to investigate method in general, rather than philosophical method specifically, and this has the added benefit of producing claims that may be useful in other disciplines where methods play a critical role. To really analyze methods we need to formalize them to some extent, so that we can look at the specifics of how a method works rather than speaking in vague generalities. Let me begin then by dividing any method into three distinct parts: the input, the state, and the results. The input represents new pieces of information that the investigator, the person using the method, receives. In the physical sciences, for example, the input represents individual observations made. The results are simply the current theory endorsed by the method. And the state represents information used in the method but which isn’t reflected in the theories produced. The state, for example, might contain a record of past theories, past observations, and so on, all of which have an effect on what theory is endorsed by the method as a result of some new information. Obviously this state seems like an artificial construct, and in many ways it is, because it is simply one way to formalize methods in general; we could either consider each method as yielding a theory given all the information received to date, or consider it as yielding a theory given new information plus some record of what we have already done. If this way of working with methods in the abstract doesn’t pan out we can always try the other way, should the problems seem to be stemming from this choice.

Of course actually getting anywhere beyond this vague outline, to formalizing actual theories, would require some way to formally represent the content of a theory, the information we receive as input, and operations on theories, as well as some way to determine how similar two theories are (for reasons that will become apparent momentarily). Naturally I don’t have any of that apparatus at my fingertips; however, we can consider methods that operate purely on numbers and make some headway in analyzing methods in general by working with these more limited examples. Every method then can be defined by three functions. The first is the input function i(k), which yields the k-th input. Next we have the state function, fs, defined as fs(k) = g(fs(k-1), i(k)), fs(0) = s0, where the function g and the value s0 will obviously be method-dependent. And, finally, we have the results function, fy, defined as fy(k) = h(fs(k-1), i(k)). And so by defining the functions i, g, and h and the value s0 we define our method and can examine how it behaves as more and more information is received (as k increases).
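To make this concrete, here is a minimal Python sketch of the i/g/h scheme just described; the class and its names are my own illustration, not part of the formalism, and the trivial example at the end is the running-maximum method used in the next section.

```python
class Method:
    """A method in the sense above: a state update g, a results
    function h, and an initial state s0 (the names are mine)."""

    def __init__(self, g, h, s0):
        self.g = g    # f_s(k) = g(f_s(k-1), i(k))
        self.h = h    # f_y(k) = h(f_s(k-1), i(k))
        self.s0 = s0  # f_s(0) = s0

    def run(self, inputs):
        """Feed the inputs i(1), i(2), ... and yield f_y(k) for each k."""
        state = self.s0
        for value in inputs:
            yield self.h(state, value)
            state = self.g(state, value)

# The running-maximum method discussed below.
max_method = Method(g=max, h=max, s0=0)
print(list(max_method.run([3, 1, 4, 1, 5])))  # [3, 3, 4, 4, 5]
```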

As discussed previously, one standard we would like our methods to meet is that they be universal: they must yield the same results for everyone. Obviously this depends in part on the input function; even if we don’t all receive the same input in the same order, we must all be able to have the same information in the ideal long run. But that is an epistemological issue that can’t be settled by examining the method by itself. However, we can determine whether the method necessarily converges to a single value in the long run (as k → ∞) for certain classes of input functions (for example, the same input function but with the values rearranged). If it doesn’t then it is a flawed method, because it could lead different people using that method to different conclusions even if they had essentially the same source of information, and that violates the assumption of universality.

Let’s consider some examples. We will consider two methods, both of which have g(x,y) = max(x,y), h(x,y) = max(x,y), and s0 = 0. However, for our first method we will let the input function i(x) be defined by selecting a value randomly for each x, with probability .5 of value 1, .25 of value 2, .125 of value 3, and so on. The class of input functions we are considering, then, is naturally every possible function generated by that random process. And for our second method we will let the input function i(x) be defined by again selecting a value randomly for each x, but this time we pick a value from the reals in the range (0,1), with each real being equally likely to be picked (for clarification, the values 0 and 1 themselves are not in this range).

It should be intuitively obvious that the first method does not converge to a single value, because as we consider more and more inputs we will always find larger and larger values, although they will, on average, be spaced farther and farther apart. On the other hand, our second method does converge towards 1; it too constantly yields larger and larger values, but they never exceed 1. How can we prove this, though? Clearly in interesting cases it may not be obvious whether our method converges. We can steal a relatively standard formula here and assert that the method converges to some k if and only if:
∀ε>0 ∀γ<1 ∃x ∀y>x : P(k-ε < fy(y) < k+ε) > γ
This asserts that we can pick an arbitrarily small ε and some arbitrarily high probability γ (although not a probability of one) and find some value x such that the result yielded by our method after that point is more than γ likely to be within ε of k.

This allows us to disprove that our first method converges simply by observing that for any k we might pick there is always a non-negligible probability of coming across a still larger value, and thus that γ cannot be arbitrarily “tightened”, and so the method doesn’t converge to any value. And our second method can be just as easily proven to converge to 1, because for any ε we might pick there is always some probability that the input will yield a value larger than 1-ε. The probability that we will come across such a value approaches 1 arbitrarily closely as more inputs are seen, and so no matter what γ we pick we can always find a suitable x, although it may be very large.
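For those who prefer simulation to proof, here is a rough empirical check, under my reading of the two input distributions: the first draws the value n with probability 2^-n, and the second uses Python’s random.random(), which is uniform on [0, 1) and close enough to the open interval above for illustration. The running maximum of the first keeps jumping indefinitely, while that of the second only creeps toward 1.

```python
import random

def geometric_input():
    # i(x) for the first method: value n with probability 2 ** -n
    n = 1
    while random.random() >= 0.5:
        n += 1
    return n

def running_max(draw, steps):
    best = 0
    for _ in range(steps):
        best = max(best, draw())
    return best

random.seed(0)
for steps in (10, 1000, 100000):
    print(steps,
          running_max(geometric_input, steps),          # keeps growing
          round(running_max(random.random, steps), 5))  # approaches 1
```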

But those were relatively easy cases (and designed as such), so let us now consider something slightly harder. First, however, I must define the encoding and decoding functions. The encoding function <x,y> encodes two values into a single number, and the decoding function (x)_y extracts those numbers, such that (<a,b>)_0 = a and (<a,b>)_1 = b. For our method we will define g(x,y) = <((x)_0*(x)_1+y)/((x)_1+1), (x)_1+1>, h(x,y) = (g(x,y))_0, and s0 = <0,0>. Since what this does might not be obvious I’ll explain in words. The state of the method is a pair of numbers, the first of which is the average input so far, and the second of which is the number of inputs processed so far. Upon receiving a new input the method yields the average of the inputs including that one. And for this method we will define the input function i(x) by randomly selecting a value from the integers between one and ten, inclusive, each with equal probability.
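Here is a small sketch of that averaging method. I keep the state as an explicit (average, count) pair rather than encoding it into a single number; the pairing function exists above only so that the state can formally be one value.

```python
import random

def g(state, value):
    avg, count = state
    return ((avg * count + value) / (count + 1), count + 1)

def h(state, value):
    return g(state, value)[0]   # the result is the updated running average

random.seed(1)
state = (0, 0)                      # s0 = <0, 0>
for k in range(1, 20001):
    value = random.randint(1, 10)   # i(k): integers 1..10, equally likely
    result = h(state, value)        # f_y(k)
    state = g(state, value)         # f_s(k)
print(result)                       # lands near 5.5 for large k
```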

Again, it is intuitively obvious that this method converges, to 5.5 specifically, but it is much harder to prove that it does so. First of all, working with it as-is would involve some fancy footwork involving computing the probability of any given sum, and so on. To simplify our task we will pretend that the input consists of only two values: those over 5.5, the highs, which are of value 7.75, and those under 5.5, the lows, which are of value 2.25. Obviously this approximates the actual input, and a complete proof would need to show that the approximation is legitimate. Now, consider an arbitrary ε and this method run over g inputs, with g also arbitrary. Assume that as a baseline the highs and lows are equal in number, and thus that the average is exactly 5.5. How many additional highs would be required to make the average greater than 5.5+ε? That is something we can calculate. Let k be the number of additional highs.
(5.5*(g-k) + 7.75*k)/g > 5.5+ε
5.5*(g-k) + 7.75*k > 5.5*g+ε*g
7.75*k > 5.5*k+ε*g
2.25*k > ε*g
k > ε*g/2.25
Thus we need more than ε*g/2.25 additional highs in order to be farther above the average than ε. The next step is to calculate how probable finding a particular ratio of highs to lows is for an arbitrary g. Fortunately for us this is relatively easy: the probability is .5^h * .5^l, where h is the number of highs and l is the number of lows. Now what we need to do is calculate the total probability of finding ε*g/2.25 or more additional highs. For any particular excess x of highs over lows the probability is .5^(g/2 - x/2) * .5^(g/2 + x/2). To get the probability of finding ε*g/2.25 or more we must integrate.
\int_{\epsilon g / 2.25}^{g} 0.5^{\frac{g}{2}-\frac{x}{2}} \cdot 0.5^{\frac{g}{2}+\frac{x}{2}} \, dx
Fortunately this is relatively easy to do, and yields:
.5^g*(g-(ε*g)/2.25)
This is the probability that the average will exceed 5.5+ε after g inputs.
Thus the probability that the average will be within ε of 5.5 after g inputs is 1 - 2*.5^g*(g-(ε*g)/2.25).
Fortunately this approaches 1 arbitrarily closely as g increases, since the exponential decay of .5^g overwhelms the linear growth of the factor in parentheses, thus proving that the method converges to 5.5. That was a lot of work for such a simple method, which reveals that if we are going to get anywhere substantial with this kind of analysis what will be needed are general rules stating that all methods with certain features converge, because otherwise we will face the extremely difficult task of proving convergence in ever more complicated cases.
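As a rough empirical companion to that argument (a Monte Carlo estimate, not a proof, with the trial count chosen arbitrarily), we can watch the fraction of runs whose average lands within ε of 5.5 climb toward 1 as g grows:

```python
import random

def average_after(g):
    # The average of g inputs drawn uniformly from the integers 1..10.
    return sum(random.randint(1, 10) for _ in range(g)) / g

random.seed(2)
epsilon = 0.25
trials = 2000
for g in (10, 100, 1000, 10000):
    hits = sum(abs(average_after(g) - 5.5) < epsilon for _ in range(trials))
    print(g, hits / trials)   # fraction of trials within epsilon of 5.5
```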

Another interesting fact that this investigation reveals is that whether certain methods converge may depend on the input being fed to them (as was the case with our first two examples). Obviously the input is something we can’t control, and given our epistemic situation we can’t even say what general constraints the input obeys, since there is always the chance that what we have been given is simply a very unlikely sequence. This would imply that for some methods in some cases we simply couldn’t know whether they converged. To overcome this problem we might build a “sanity requirement” into our methods. Generally it will be possible, by examining the method, to determine what kinds of inputs it will converge for. We can thus use the state of the method to record what inputs have been seen so far and, at every step, determine what kinds of converging inputs are possible and how likely it is that the values being yielded would vary as they do if we were actually dealing with such an input. If the method is varying in ways that seem extremely unlikely given an input that will produce convergence we might have the method produce a result that indicates that the input is invalid, and it will thus converge on this “invalid” result. For example, given the averaging method discussed above we might build in a sanity requirement that yields this invalid answer if the average value shifts too much after a large number of inputs. Obviously such shifts aren’t impossible; it might be that the initial run so far has been highly improbable or that we are encountering an extremely improbable sequence somewhat earlier than might be expected. And so guaranteeing that the method always converges when the nature of the input is unknown may sacrifice accuracy, as there are rare situations where the method will now converge to the invalid result when previously it would have converged on an actual result. But, since we can make these situations arbitrarily improbable, this is an acceptable tradeoff for being able to handle input sequences where convergence is otherwise impossible.
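A very rough sketch of what such a sanity requirement might look like for the averaging method, with the warm-up length and the allowed shift chosen arbitrarily for illustration: once enough inputs have been seen, a large shift in the running average is treated as evidence that the input is not of the converging kind, and the method locks onto an “invalid” result from then on.

```python
import random

INVALID = "invalid"

def sane_average(inputs, warmup=1000, max_shift=0.5):
    """Running average with an arbitrary sanity requirement bolted on."""
    total, count, reference, invalid = 0.0, 0, None, False
    for value in inputs:
        if invalid:
            yield INVALID
            continue
        total += value
        count += 1
        avg = total / count
        if count == warmup:
            reference = avg          # remember where the average settled
        if reference is not None and abs(avg - reference) > max_shift:
            invalid = True           # the input no longer looks convergent
            yield INVALID
        else:
            yield avg

random.seed(3)
well_behaved = (random.randint(1, 10) for _ in range(5000))
print(list(sane_average(well_behaved))[-1])   # a value near 5.5, not INVALID
```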

November 6, 2007

Sources Of Information

Filed under: Epistemology — Peter @ 12:00 am

To have a source of information is to have access to something, and thus to have at least the possibility of using that information to make reliable judgments about it. The reason sources of information should be important to us is twofold. First, there is obviously a strong connection between the ability to have knowledge about things and the ability to have information about them, as it seems impossible to have knowledge without having information (which is not to preclude the possibility that there may be other factors that turn that information into what we would consider knowledge, although that is a matter for another time). Thus the ability to identify our sources of information seems like an important component in any evaluation of the way we reason, because knowledge that doesn’t stem from them in some way isn’t really knowledge about anything at all. Secondly, it also seems that having sources of information is what allows us to meaningfully refer to things. Specifically, it would seem that we can only successfully make reference to something by pointing at it through our information (by identifying it as the cause of that information). Without the ability to pin down what we are talking about in this way our words become completely general (they could be about anything that has the right relationships), and thus describe a pure formalism. And this has important epistemological implications as well, because what a pure formalism can tell us, and how we evaluate it, is different from claims that are about some subject matter.

This brings us to the central problem, identifying the sources of information available to us. Obviously we can’t “prove” what we do and don’t have information about. To prove something, in the strict sense of a deduction from simpler principles, presupposes certain sources of information, sources that justify the premises of the proof and the claim that the inferences involved are valid inferences to make when the domain under consideration is sources of information. As in many other places in epistemology, what we must do is consider the claim that we have a particular source of information as “evidence”, a claim that does not itself need justification to be used as the basis for further claims. However, like all evidence, this claim must be falsifiable: it must be possible to reject a proposed source of information. And that is how we go about seeing what sources of information we have: we can hypothesize that whatever we like is a source of information, but we must then critically evaluate that source. The way to perform that evaluation is by testing the consistency of a proposed source of information, the assumption being that if a source of information is consistent then it is likely that there is something causing it to be consistent, and that this something, whatever it is, is what the information is about. For if this information is not in fact about something, but rather comes from a void (or a random process of some kind), then the probability that it will remain consistent as more and more information is drawn out diminishes, and thus in the ideal infinitely long run such a way of evaluating our sources of information will leave us only with those that actually are sources.

Let us consider then our senses as a source of information. Certainly our senses seem consistent. Obviously we sense different things at different times, but the changes seem to follow predictable patterns, and thus we are fully capable of interpreting our senses as revealing information about an external world which is generally consistent over time, allowing for changes in that world. And there is a further consistency between what we sense and the reports of other people (remember, the reports of other people are further evidence of consistency, they don’t prove anything, and so we are not unjustifiably assuming that other people are real). But, of course, our senses are not perfectly consistent in this way, as is demonstrated by phenomena such as optical illusions. And this might seem to invalidate our senses as a source of information by our own criteria. But when it comes to these “inconsistencies” we possess explanations of them, both in the sense that we can predict and thus compensate for them, and in the sense of knowing reasons why they occur. Of course these explanations are themselves justified by our senses, but this doesn’t invalidate them. What it shows is that our senses really are consistent, despite what might seem like some apparent inconsistencies in them; it’s just that the way in which they are consistent is more complicated than we originally supposed. A real inconsistency in our senses would have to be something that we couldn’t explain and thus was essentially unpredictable, and if such inconsistencies existed they would throw all the information we derive from our senses into question, because there would be a real possibility that any information we derive from them is really a product of one of these unpredictable deviations. It would probably drive us mad.

Similarly, we seem to have information about our own mind. As with our senses, our access to our own mind doesn’t reveal the same internal state to us at all times, but the internal state revealed in this way seems to change in generally predictable ways. Of course this might seem to presuppose the accuracy of our own memories, possibly illegitimately since our ability to recollect things is part of our minds. But, again, it is important to point out that what we are looking for here is only consistency, and not a justification. Our only goal is to examine our memories, and our mind in general, for consistency, not to prove that they are accurate. Obviously when it comes to our own minds we can’t compare our observations of them to those of other people (since they don’t seem to have access to our mind), but this doesn’t mean that this source of information is deficient in some way; it simply means that we have fewer opportunities to detect inconsistencies. I would also like to note here that even though we can consider ourselves to have information about our own minds this doesn’t mean that the mind is as the information seems to portray it, just as the external world isn’t necessarily as the senses portray it. The natural interpretation of the senses is as revealing an external world that is essentially continuous, but we know that it is actually made of discrete atoms. Likewise just because our access to our own minds is naturally interpreted in terms of qualia or “feels” doesn’t mean that such things actually exist; it just means that they reflect something that does exist.

Now let’s turn to a more interesting and controversial case, the possibility that we have access to information about mathematical objects or some mathematical realm through mathematical intuition of some kind. Certainly there are domains in which it seems that our mathematical intuitions are relatively consistent both internally and with the mathematical intuitions of most other people, namely logic, arithmetic, and possibly geometry. However, inconsistencies with these intuitions are certainly possible, and sometimes even mathematically profitable. For example, it is possible to construct internally consistent logics that defy our mathematical intuitions about truth and to use them productively. More significantly there are a large number of cases in which our mathematical intuition is essentially silent, where both of two mutually exclusive possibilities are consistent with everything that is mathematically intuitive to us, such as the continuum hypothesis or the axiom of choice (although the two are not independent of each other). And, on these issues, it is even possible for the mathematical intuition of different people to pull them in opposite directions. Now such inconsistencies do not themselves necessarily refute mathematical intuition as a source of information. The real problem is that no justified explanation can be given for these inconsistencies, as we could give for the apparent inconsistencies of the senses. Certainly we can hypothesize reasons, but none of them are backed up by anything besides the necessity of coming up with some explanation (nothing in our source of information about the mathematical world, or about our minds, or the external world, has anything to say about our connection to the hypothesized mathematical world or its limitations). This refutes mathematical intuition as a source of information, unless insights from some other source of information can validate it and explain away the apparent problems. (Of course it doesn’t rule out the possibility that mathematical intuition is a source of information about something subjective, but if it is construed as such it is only an addition to the information we have about our own minds; specifically it would be information about the psychological sources of mathematical intuition.)

What the case of mathematical intuitions brings up is that we need some way to distinguish between genuine sources of information and instincts. Instincts may be apparently consistent among people, but that doesn’t mean they provide information about anything except what was of survival value in the past. For example, we could imagine that all people are born with the belief that “X is Y”, but such a universal belief does not reveal any information to us about this “X”. The most obvious distinction between the two seems to be that a source of information can always divulge genuinely new facts, but instinct (and thus intuition) seems limited to a finite number of facts or general rules. Thus I would explain our mathematical intuition as a kind of useful instinct that leads us to believe in certain general rules about inference, addition, and so on. But it is obviously not a source of information, because when we get beyond those general rules, to things that are independent of them, our mathematical intuition is silent or inconsistent.

As a final note I would like to say a few words about how taking the claim that we have a certain source of information as evidence integrates with my previous assertion that we should treat particular pieces of information derived from such sources themselves as evidence. It might seem that saying we have a particular source of information would entail the truth of the information derived from it, thus making treating those specific claims as evidence redundant. But this is not actually the case. Saying that we have a particular source of information is not to make a claim about what is and isn’t the case, but a claim about reference; it is to say that there is something that the information from that source reaches out to. And just because certain claims are produced by a source of information doesn’t mean that they are correct. As was pointed out in the case of our senses, the claims produced may occasionally contradict each other, at least under their natural interpretations, and further investigation is needed to determine which are in error and why. And thus claims stemming from this source of information may indeed turn out to be false (those false claims still convey information, just not the information they were originally thought to).

November 4, 2007

The Trouble With Explanations

Filed under: Epistemology — Peter @ 12:00 am

I would characterize every discipline that aims at investigating some subject matter as desiring to produce an explanation of that subject matter, to categorize it and devise rules that describe its internal structure and its relations to other things, as well as how it changes over time. Obviously what we are studying and how we study it varies from discipline to discipline, and thus the nature of the explanations produced will vary, but anything that can be said about explanations in general applies to all the disciplines involved in producing them. If there is some best way to arrive at explanations it must apply to all these disciplines, and if certain kinds of explanations are invalid they are invalid no matter where they are invoked. I bring this up simply because the examples I will be providing will primarily be concerned with philosophical explanations, since those are the kinds of explanations I am most familiar with, but unless philosophers are extraordinarily unlucky I expect that the kinds of examples I am using here could easily be replaced with developments in the history of physics or economics.

With that said allow me to proceed to my central claim, which is that when we search for explanations we generally begin our search with expectations about the form that the explanation can take. And while sometimes these expectations may be useful, in steering us directly towards the best explanations, at other times they can completely prevent us from successfully explaining what we intend to. Such tendencies result in the kinds of explanations that simply add more epicycles rather than approaching the problem in new ways. And in the history of astronomy the Copernican revolution is an example of breaking away from flawed expectations about the form that explanations can take in order to arrive at better ones. Before Copernicus the job of astronomy was to explain how the planets revolved around the earth. And obviously if you are trying to develop such an explanation the idea that the planets might revolve around the sun will never even enter your mind; that the planets orbit the earth is a kind of assumption which you proceed to cast your explanations in terms of. The Copernican revolution can thus be understood as shifting what needed to be explained from the orbit of the planets around the earth to observed planetary motion. And when trying to explain how the planets are observed to move we are no longer required to make them follow paths around the earth. Now this is not to say that before Copernicus any astronomer ever sat down and decided that the earth being at the center was something that they were going to take for granted; they took it for granted without realizing that the assumption was not necessarily true. And that is what makes such expectations so insidious, because they lead us to effectively reject whole categories of possible explanations without actually seriously considering them.

Given that our expectations with respect to what kinds of explanations are acceptable can be problematic we would like to know when these expectations are actually interfering with our theorizing. Unfortunately, by their very nature, they are essentially unconscious, meaning in this case just that we don’t think about them as assumptions; we operate on the basis of them without passing judgment on them. But there are some indirect clues that we can pick up that can at least awaken us to the possibility that such expectations might be interfering with our judgment, even if they won’t point them out directly. One such clue is when we find ourselves dealing with problems that seem insoluble, by which I mean not that we are having problems solving them, because any truly interesting problems are hard to solve, but when we find ourselves facing problems that have resisted solution for a significant amount of time. With such a problem it might very well be the case that it has resisted solution for so long because those investigating it, us included, have shared some common assumption about what the explanation should be that has prevented us from arriving at an actually satisfying explanation. Another sign that we may be suffering from illegitimate expectations about the form the explanation should take is when the explanations that seem best all raise further problems which themselves have no obvious solution. Consider, for example, the desire to know why the universe exists. Since the explanations of why things exist are often causal explanations some may insist that to answer that question we must reveal the cause of the universe to them. But to give an explanation in such terms leads to an infinite regress, because if everything must have a causal reason for its existence so too must whatever we posit as the cause of the universe. Thus the explanation simply raises more things that must be explained. Or consider the proposal that mathematical entities, such as numbers, are objects. Such a proposal solves certain problems, but it raises the problem of how we have access to these objects. And if some special mental faculty is supposed we must then ask how that faculty manages to provide us with reliable information (since all such information supervenes on causal relations), and so on. Such explanations prove equally unsatisfactory because they simply move around what needs to be explained (rather than, for example, explaining it in terms of things that have already been satisfactorily explained). Again, such an explanation is not really an explanation, but rather a sleight of hand, which pretends to be an explanation simply by kicking up dust and so hiding what is in need of an explanation. When we find ourselves dealing with explanations that hide the problem rather than solving it this is an indication that there is something preventing people from legitimately solving it, quite possibly poor expectations about the explanation itself.

Suppose then that we become convinced that certain expectations are preventing us from formulating satisfactory explanations, or that we suspect they might be. To help discover what these expectations might be it is helpful to consider where they often stem from. One source of these expectations is simply the assumption that what worked for similar cases in the past will work for the case we are confronted with now. For example, scientists were baffled by certain phenomena in the early 20th century because they couldn’t explain them in terms of Newton’s laws. Such mistakes are, however, the most innocent (because that is essentially how we should proceed), and thus often the easiest to overcome. The more problematic assumptions stem, I think, from the way we talk about phenomena, and thus, I suppose, the way we naturally find ourselves thinking about them. For example, we often find ourselves talking about phenomena of all kinds as things, because that is simply the way language works; the most natural way to talk about something is to label it with a noun so as to talk about that thing. This leads to questions such as “what kind of thing are numbers?” which, as mentioned above, simply can’t satisfactorily be answered because numbers are not things. But thinking about them as things may lead some to believe that we need to find them, or that, like other things, their failure to exist would imply that they are fictions and thus completely subjective. Without the expectation that numbers are things there is no reason to be led to such conclusions. Another way to develop faulty expectations is simply to assume that our linguistic judgment is the ultimate and best way to judge things in a certain situation. For example, it is a faulty expectation to demand that whatever our explanation of goodness says is good will be intuitively judged to be good by us. And, finally, our language may mislead us about what needs to be explained, because if we develop a label to describe some phenomena we may think that this label itself needs explanation. But it is quite possible that the correct explanation may simply reveal that we have incorrect expectations about this label and what it signifies, such that in many ways it might be considered an illusion. The best example of this is the demands of some that we explain qualia (what things “feel” like), and that these explanations somehow expose the qualia themselves to us. But it is quite possible for the correct explanation in this case to be an explanation of why we think about our own consciousness in such a way that there are qualia. To insist that qualia really are something of their own is to presuppose something about the form of our explanation, when all we are really required to explain is our judgments about qualia.

Let us further suppose then that by means of such considerations we have isolated certain flawed expectations that are preventing us from providing satisfactory explanations, and indeed I have identified quite a few candidates already where our expectations might really be interfering. How do we overcome these expectations? The best way may simply be to provide a better explanation that doesn’t conform to them, and then hope that the clear superiority of this explanation will reveal the expectations themselves as flawed. This is the way the physical sciences tend to proceed, and they are bothered least by faulty expectations. But the rest of us may not be so lucky, and may have to supplement our explanation with reasons to reject the expectations themselves. And one way to go about doing that is to illustrate how the expectations arise and then argue that such sources can have no claim to legitimacy (usually this is easiest to do when the root of the expectations is linguistic). The problem is, unfortunately, that not everyone will be convinced by such arguments. In philosophy, for example, intuitions are very heavily leaned upon (although they are often obscured in some way so that those leaning on them can claim that they are not intuitions). And these expectations are problematic precisely because they are intuitive, and so an explanation that goes against them will be unintuitive by necessity. My solution is simply to reject doing philosophy on the basis of intuition of any kind, and thus if someone should complain that we have said something unintuitive we can simply shrug our shoulders and move on, because we simply can’t satisfy everyone’s intuitions.

But the best solution to all of these problems is to avoid them from the outset through the way we describe the problems we are trying to solve, so that the possibility that significant expectations may slip in is reduced or eliminated. This means that the way in which we describe our problems must be free from any presuppositions about the form the solution may take or the things it must involve. The simplest way to do that is to phrase our problems only in terms of what we can observe, at least as much as we can (with the idea that whatever we cast our problem in terms of that isn’t a matter of observation may turn out to be problematic). If we are trying to explain why certain lights in the night sky move as they do then this presupposes nothing about them or our explanation. They may be planets moving around the earth, or around the sun, or they may simply be persistent delusions. And when we can’t cast a problem in such terms it reveals that we are including something problematic in it, something whose legitimacy we must question.

October 26, 2007

Evaluating Unknown Methods

Filed under: Epistemology — Peter @ 12:00 am

Suppose that there is a completely sealed black box that you cannot touch or interact with in any way except by looking at it. And further let us suppose that a method for determining facts about the contents of the box drops into your lap, either a deductive one, that starts from the information you have access to and produces claims about the contents of the box, or a hypothetical one, which you feed claims about the contents of the box and which proceeds to either validate or refute them. Or suppose that you don’t know how addition works, that you don’t even know whether adding two numbers is supposed to produce the number you get by counting two collections of objects of those cardinalities as a single collection. And, similarly, a method is given to you for adding two numbers together. How, in these cases, are we to evaluate the methods that we happen to come across? Obviously if we already had another method for determining the same facts we could compare the two, either formally by proving in some way that they must produce the same results, or by simply comparing their results in a large number of cases. But that is not the situation we find ourselves in here; here we are really in the dark. And indeed the study of anything must find itself in this situation at some point, because there must be a first method (because the history of people using methods to uncover these facts doesn’t extend infinitely into the past). So at least some real people have found themselves in this situation. How did they extricate themselves from it?

Now just because we don’t have a method for determining the answers we are pursuing in the general case doesn’t rule out the possibility that we know some specific facts by other means. For example, we might know that there are only five objects in the box, or that 2+3=5. Obviously if the method fails to agree with any of these specific cases then we know that it is a flawed method, at least in some situations. And since we have no way of determining its accuracy in any of the innumerable other cases which we know nothing about, this may motivate us to set it aside as a way of knowing the answers we are interested in. But still, even if it does not contradict these specific cases, this does not mean that it is correct in all cases, or even most of the time, although it does lend at least some credibility to that possibility. The number of cases we know about in other ways is obviously finite (given our assumptions), and possibly quite small in number, or even nonexistent. However, this doesn’t exhaust the knowledge we may already have about the subject of our investigations. Besides knowing specific facts we may also know certain general properties or bounds. For example, given the fact that we can observe the box from the outside we know that there are limits as to the size and volume of whatever is within it. And when it comes to addition we may know that certain laws hold, such as the associative law. Knowing such general limits allows us to check every claim made by the method, determining whether it conforms to these properties. Still, even if the method doesn’t violate these properties this doesn’t ensure that it is correct either. It is fully possible for the method we are handed to still be in error about the box’s contents, just in ways that don’t violate the size constraints. Similarly, there are many mathematical operations that are associative, and so just because the method obeys this property doesn’t mean that it is doing addition.
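To illustrate, here is a sketch of these checks applied to an addition method handed to us as a black box; the deliberately flawed mystery_add below is my own stand-in, and the associativity test only samples random cases, so a pass is evidence rather than proof.

```python
import random

def mystery_add(a, b):
    # A stand-in for the method that drops into our lap; deliberately
    # wrong for large first arguments so that the checks have teeth.
    return a + b if a < 100 else a

known_cases = {(2, 3): 5, (0, 0): 0, (7, 7): 14}

def passes_known_cases(add):
    # Check the method against the few specific sums we already know.
    return all(add(a, b) == total for (a, b), total in known_cases.items())

def seems_associative(add, trials=5000):
    # Check a general law we expect of addition on random samples.
    for _ in range(trials):
        a, b, c = (random.randint(0, 1000) for _ in range(3))
        if add(add(a, b), c) != add(a, add(b, c)):
            return False          # found a counterexample
    return True                   # no counterexample found; not a proof

random.seed(4)
print(passes_known_cases(mystery_add))   # True: the known cases are all small
print(seems_associative(mystery_add))    # almost certainly False
```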

Such tests exhaust our ability to use what we already know to evaluate the method. But these are not the only resources available to us. There are ways of evaluating the methods simply by the results they produce, without appealing to any knowledge about what the results should be in the intended cases. One such way is by examining the coherence, consistency, and universality of the results. Examining the coherence of the results means seeing whether the claims made by the method can all “fit” together, or whether some contradict the others. For example, if our method tells us that there are five objects in the box in one application, that all the objects are cubes in another, and that there are four cubes in the box in a third, then we have a contradiction. Clearly not all of these claims can be true at once, and so we have discovered that at least one of them is false. This allows us to discover a flaw in the method without actually knowing which results are true and which are false, which is quite elegant. Of course it is possible that the contents of the box are in some kind of rapid flux, but we can, for the most part, rule this out by using the method over again and seeing if it still produces those same results. Which brings me to consistency; we expect that the method will produce the same results with repeated applications. Of course in the case of the box there is the possibility that the contents may be changing, but we can reasonably expect that the contents are changing at some rate, and so that applying the method in rapid succession will result in answers that are closer to each other than applications that are farther apart in time. So while such inconsistencies aren’t definitive in the case of the box they may be suggestive. Arithmetic, on the other hand, we expect to be completely consistent (perhaps this is another one of those properties that we know about it beforehand), and so any inconsistency would seem to show that the method produces some errors. Finally, we have universality, which means that we expect the method to produce the same results for different people, assuming that it is used correctly by them. Again, the failure to be universal doesn’t definitively show the method to be failing in these cases (perhaps we are all working with slightly different information), but it is suggestive (especially since multiple people can use the method at the same time, which helps determine whether any inconsistency is caused by actual changes or by a failure of the method).
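A toy rendering of the consistency and universality tests, with canned stand-in methods of my own invention (nothing here checks truth, only whether the answers agree with each other and across users):

```python
def consistent(method, queries, repeats=5):
    # Same question, repeated applications: do we always get one answer?
    return all(len({method(q) for _ in range(repeats)}) == 1 for q in queries)

def universal(methods_per_user, queries):
    # Same questions, different users: do all their answers agree?
    answers = [[m(q) for q in queries] for m in methods_per_user]
    return all(len(set(column)) == 1 for column in zip(*answers))

def count_objects(query):
    return 5            # a canned stand-in for the box method

def dissenting_method(query):
    return 4            # a second "user" whose copy of the method disagrees

queries = ["how many objects are in the box?", "are they all cubes?"]
print(consistent(count_objects, queries))                                     # True
print(universal([count_objects, count_objects, dissenting_method], queries))  # False
```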

Again, even if the method in question is coherent, consistent, and universal this doesn’t prove that it is correct. Another way that we can test our unknown method is to see how it performs in situations other than where it was intended to be used. For example, we can use the method that is supposed to reveal the contents of the box to us on another box, one that we do have access to. There are three possible results of this test. If the method works flawlessly in these situations then it provides evidence, strong evidence, that the method is a good one. Or the method may simply fail to be applicable; the process by which it proceeds may simply not be possible when considering other boxes. Again, this lends some credibility to the idea that the method is a good one (although less credibility than working perfectly would), because it implies that the method is really proceeding on the basis of some connection to the subject matter that doesn’t exist in these cases. Finally, it may produce the wrong answers, which makes it less likely that the method is a good one, although it doesn’t rule out the possibility (it is possible that the method is simply very poorly designed). Obviously testing the method for addition in this way is a bit harder, because there is nothing that we could apply it to outside of numbers that it could produce a correct result for. Still, we might try “adding” a shoe and a bottle using it and seeing what happens. If the method produces any result at all this would make it quite questionable (since it is hard to conceive of methods that can work with both shoes and numbers without proceeding simply by disregarding their content; a way of adding that simply always yields the same result, which would have passed all our previous tests, may be revealed as questionable in this way). As a side note I would mention that I often test philosophical methods in this way, seeing what they would do with the question “what is water?” If they agree that water is H2O (producing the answer themselves, not by simply refusing to argue with science) or reject the question as not properly philosophical then they pass; otherwise I toss them out as flawed.

At this point we might also perform a sanity check (that’s a technical term) on the results of the method. Do we understand the claims produced, or are they completely opaque in meaning to us? Either of the methods could produce claims that contain words that are completely undefined, but pass all the other tests we have put them to. A method that produces consistent nonsense has no problem when applied to situations other than its intended application, because we have no way of determining whether that nonsense was right in those other situations, even if we know all the facts about them. For example, if we examine our box using the method and it produces the claim that “gleeb is grue” we simply can’t tell whether to accept or reject that answer on the basis of the contents of the box. I maintain that claims that are opaque in meaning are really pseudo-claims: they seem like they are claiming something but they really aren’t. Thus a method that produces them isn’t really producing claims at all, and hence isn’t really a method, which certainly warrants rejecting it as a method.

Finally, we can examine how the method itself works. When we pick apart the method we are looking to inspect where the answers come from. For example, in the method for addition we will find a set of rules for operating on two numbers. From the nature of these rules we can in turn deduce a number of general facts about addition, if the method is correct. Now in the case of addition this may or may not help us settle the matter. If we know certain facts about addition to begin with we may be able to prove the correctness of the method. But if we don’t then all this inspection will reveal to us is that it defines some mathematical operation. And in fact this, I would claim, is really all there is to the matter in this particular case: if we have decided so little about how addition works to begin with then there really is no fact of the matter yet about what addition is for us, and we can choose to simply accept this method as defining it. When it comes to the method for determining the contents of the box things are not so open. By inspecting the method we expect to find at least some possible way for information about what is in the box to make its way to us through the method. If the method doesn’t even contain the possibility of such a connection (if it produces its answers by looking them up in some specified table, for example) then we can conclude that it is not a method for discovering what is inside the box, and that if it is right it is only right because whoever dropped the method in our laps had some other way to determine the contents of the box and put that information into what we thought was a method. But usually such an inspection reveals at least a possible connection between the contents of the box and us. For example, the method may work by measuring air currents, which implies that the contents of the box affect the air around the box. Or it may proceed on the basis of intuition, implying that we have a mental connection to the contents of the box. Given these connections we can produce a judgment about the method based on what we already know about how information and knowledge work in general. Is it possible that the contents are affecting the air around the box? Is it possible that we have a mental connection to its contents? While such considerations aren’t definitive, because this may be the case that shows our previous judgments about these issues to be in error, they are strongly suggestive.

October 15, 2007

Claims And Experience

Filed under: Epistemology — Peter @ 12:00 am

Consider a claim about some subject matter. How do we know that this statement actually is a claim, that it actually asserts something about the subject matter? This may seem like a pointless question; after all, don’t we just know, upon seeing a claim, that it is a claim? But the point of this investigation is not to determine what is and isn’t a claim in perfectly precise terms; the point is to explore how claims work, how they are connected to their subject matter, which in turn will limit the kinds of claims that we can make if we want to actually assert something.

Obviously what makes a statement a claim that manages to actually assert something about its subject matter is not something that results from its form. When considering animals, for example, if I assert that all X are Y in the absence of some definition being given to X and Y in this context, I fail to actually assert something about animals. And this failure to assert does not stem from the fact that X and Y are undefined. We could easily provide definitions for these two terms, such as X is the cause of Z and Y is the mirror image of Z, and these definitions do not help the claim assert anything about animals, despite the fact that the terms are now all defined.

What is missing in this absurd example is any kind of connection between the claim and the subject matter itself. It’s not clear what X and Y indicate about animals. X and Y need to be such that they have some implications for the subject matter. In more technical terms we could say that what is also required is for us to be working under some rules for reference that allow us to put statements involving X and Y into a kind of correspondence or equivalence with the subject matter itself, meaning that such statements reflect a way that the subject matter could be. But now we face a new problem, constructing the rules for reference. We don’t have direct access of any kind to the subject matter (despite the fact that we often think of ourselves as having this kind of access), and it doesn’t look like we can construct a set of rules for reference without such access.

To overcome this problem we must first recognize that rules for reference can be constructed out of other rules for reference. For example, if we already had F and G and rules for reference that put F and G in a kind of correspondence with certain properties of the subject matter then we could define H in terms of F and G (for example, by asserting that H is the case when either F is the case or when G is the case). This creates a rule for reference that connects H with F and G. And since F and G are connected by rules for reference to the subject matter there thus exists a compound rule for reference that connects H to the subject matter. Thus all we need are some primitive rules for reference connecting some conceptual formalisms to our subject matter, and then we will be able to assert things about that subject matter. If we are to uncover such primitive rules for reference I think we need to consult the means by which we have (indirect) access to the subject matter in the first place: experience. I hold experience to be our only means of access to any kind of subject matter simply because I cannot conceive of a way to have access to anything without it. Even if we suppose that our subject matter is something that we have direct intellectual access to (such as the world of forms) the only way to gain information about it is by experiencing this direct intellectual access (perhaps by experiencing our concepts or intuitions). We are only conscious of anything through experience of some kind, and so to assert that we don’t have access to the subject matter through experience is to assert that we aren’t conscious of this access, at which point it becomes hard to see how we can make claims about it when we aren’t even conscious of the fact that there is anything to make claims about.

Experience itself, I claim, provides primitive rules for reference, and from these we can construct all the rules for reference that we need. What experiences refer to is naturally a matter with some complications, which a detailed account of perceptual intentionality should uncover. But for our purposes here it is enough to describe the rules of reference between the experience and the subject matter as the experience corresponding to whatever is the source of that experience.

This chain of reasoning seems to be leading us to conclude that asserting something about some subject matter is intimately connected with experiences of some kind, such that every assertion will involve some claims about experience or possible experience. Obviously this sounds a lot like the logical positivist project, so before I add more details to this account let me say a few words about why the logical positivist project failed and why this account doesn’t suffer from similar problems. One problem with that project was that it was connected too closely with formal logic. Instead of thinking about how statements might be transformed into assertions about experience and the subject matter the logical positivists assumed that the complete theory would logically imply them. And logical implication is problematic because it doesn’t really reflect the kind of correspondences we are dealing with. But such problems could often be swept under the rug with enough qualifications. The real problem with the project is that it made no distinction between the rules for reference and the assertions themselves. The rules for reference are essentially a formalism; they aren’t claims about some subject matter. By treating the rules for reference and the claims themselves in the same way and trying to explain the relationship of the claims to experience in terms of logical implication the theory was overburdened, and collapsed under internal contradictions.

Of course the rules for reference may seem like claims. For example, to claim that X gives rise to certain experiences seems like a claim about the properties of X. What this is a manifestation of is that we only need one set of rules for reference to pin down X so that we can make assertions about it, meaning that we could pin down what X refers to in terms of experiences directly or in terms of simpler terms which are themselves well-defined. Once that is done any further rules for reference can be treated as claims, and evaluated by seeing if they conform to X as described by the rules of reference that we are taking as pinning down what X asserts about the subject matter. In an absolutely formal setting we would set up our rules for reference, set them aside as special, and then move on from there. But in real life what the rules for reference are may be subject to change, sometimes depending on the situation. For example, a kind of tree may have once been pinned down by rules for reference that connected it to certain ways trees could appear. But now particular kinds of trees are pinned down by rules for reference in terms of genetics, which are in turn pinned down by rules for reference involving specialized instruments. And thus the old rules for reference are now a claim about that kind of tree, namely that they generally develop to be trees with that kind of appearance. But despite the fact that the rules for reference may be in flux, and that what we assert with the use of a term may change as our understanding of the subject matter develops, what is asserted by a claim at any particular time must be fixed by some rules for reference, and those rules for reference are not claims.

I guess I should also briefly say something about how general terms can be pinned down by rules of reference that are essentially grounded in experiences that refer to singular occurrences. Since I have discussed how this might be accomplished previously when discussing language I will be brief. The short story then is that when we are defining our concepts in terms of experience we don’t have to define them solely in terms of the experiences we have had; we can define them in terms of possible experiences, as asserting that certain experiences are possible or impossible. And that allows us to define “kind” expectations in terms of experiences: as a claim that certain experiences, ones which would distinguish the members falling under the supposed kind in some fundamental way, are impossible, or that the experiences we have of each entity falling under that kind will share certain similarities.

Finally I would like to point out that the claims developed here about what is required to assert something regarding some subject matter (rules for reference that are ultimately in terms of experience) essentially amount to a formal justification of my earlier claims regarding epistemic indifference (that if the difference between two claims doesn’t impact our experiences in any possible way then the difference between them simply doesn’t matter, or, in other words, the only things that matter are how claims relate to our experiences). Obviously epistemic indifference is motivated more by pragmatic concerns, and is independent of any theories regarding the nature of claims and their relationship to the subject matter they are about. And indeed for that reason I am inclined to rely more on epistemic indifference, since essentially the same conclusions can be drawn from both theories. But the fact that they do coincide leads me to believe that I may be reasoning correctly, since they share no premises in common and yet say basically the same thing.
