Archive for the ‘Information’ Category

Synergy between Wisdom of Crowds and Statistics’ Sample Size

January 19, 2008

The core idea behind the “wisdom of crowds” is that by aggregating information from a large, diverse group of individuals you can obtain a better solution and make better decisions.

Today I was reading Statistics for Dummies by Deborah Rumsey and realized that the motivation for the wisdom of crowds is quite analogous to the motivation for having a large sample size in statistics, as can be seen in these snippets from the book:

Fewer participants in a study means less information overall, so studies with small numbers of participants in general are less accurate than similar studies with larger sample sizes … Most researchers try to include the largest sample size they can afford, and they balance the cost of the sample size with the need for accuracy … Check the sample size to be sure you have enough information on which to base your results.
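The link between sample size and accuracy can be made concrete with the standard error of a sample mean. The excerpt does not give a formula, so what follows is only a sketch using the usual textbook relationship (standard deviation divided by the square root of the sample size), assuming a simple random sample of independent observations. The function name `standard_error` and the standard deviation of 10.0 are made up for illustration.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of a sample mean, assuming n independent observations."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return std_dev / math.sqrt(n)


# Hypothetical population standard deviation, chosen only for illustration.
STD_DEV = 10.0

for n in (25, 100, 400, 1600):
    print(f"n = {n:5d}  standard error = {standard_error(STD_DEV, n):.2f}")
```

Under these assumptions, quadrupling the sample size only halves the standard error, which is one way to see why researchers have to balance the cost of a bigger sample against the accuracy it buys.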

Information Patterns – very exciting!

December 10, 2007

Science is all about identifying patterns and then representing those patterns abstractly.  Abstract representations allow us to contemplate the properties of a whole class of things, rather than treating each thing on a case-by-case basis.

For example, long ago people noticed that when an object is lifted and released, it falls down to earth.  It doesn’t matter whether the object is a rock or feather or anything else.  All objects exhibit the same pattern of behavior.

The Internet is all about exchanging information.  Certainly we should be able to identify patterns in that vastness of information.  Thus we are challenged to become information “scientists”: identify patterns in information, create abstract representations of the patterns, and then for each particular instance (i.e. each web document) relate it to an abstract representation.  Once we – the community of web designers – start to do this then we will be able to do some very exciting things.  We will, for example, be able to collect all instances of an information pattern and (1) recognize that they are all of the same class of things, and (2) aggregate, manipulate, and massage the information in ways that make sense for that class of information.

Here is an example of an information pattern: Garlic Lowers [Does Not Lower] Cholesterol

Outsource it or do it in-house?

November 20, 2007

When should an organization outsource a job and when should an organization do the job in-house? Let’s consider an example.

Consider a book publisher. Rather than having a staff of full-time writers whom it would pay to write books, the publisher bids for books and negotiates with agents.

Book publishers outsource because they want to have access to the maximum diversity of ideas and information. A publisher thinks its chances of publishing interesting books are better if it leaves the door open to lots of different writers, and so it’s willing to endure the hassle of having to sign each book on a case-by-case basis. The benefits of leveraging the actions and intelligence of the crowds outweigh the costs.

Need to tap into the collective intelligence? Then outsource it.

Need things done quickly? Then do it in-house.

— Extracted from The Wisdom of Crowds by James Surowiecki

Not everything that can be counted counts …

September 16, 2007

“Not everything that can be counted counts, and not everything that counts can be counted.”  [Albert Einstein]

Example: when researching a species these things are important, but cannot be counted: the texture of the skin, the color, the smell.

So, it’s not only quantitative data that is useful when collecting evidence and information.

— Extracted from Hard Facts by Jeffrey Pfeffer and Robert I. Sutton

Is there more information in a rock than in the human genetic DNA code?

August 18, 2007

How much information is in here:

  • It is a fine day.

Let’s measure the information by the number of characters. There are 17 characters, so the amount of information is 17.

How much information is in here:

  • It is a fine day. It is a fine day. It is a fine day. It is a fine day. It is a fine day.

The same sentence is repeated five times. Is the amount of information 17 x 5 = 85? Answer: No. The extra four sentences don’t say anything new. The information that is present is this:

  • Repeat 5 times: It is a fine day.

The number of characters in this is: 33
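The character counts above are easy to check. Here is a small sketch that uses the post's own measure (number of characters, spaces and punctuation included). The helper names `char_count` and `expand` are invented for this illustration.

```python
def char_count(text: str) -> int:
    """The post's measure of information: characters, including spaces."""
    return len(text)


def expand(times: int, sentence: str) -> str:
    """Rebuild the long text from the short 'Repeat N times' description."""
    return " ".join([sentence] * times)


sentence = "It is a fine day."
description = "Repeat 5 times: " + sentence

single = char_count(sentence)        # one sentence
naive = single * 5                   # the tempting 17 x 5 answer
compact = char_count(description)    # the short description

print(single, naive, compact)
print(expand(5, sentence))
```

The short description is enough to regenerate the full text, which is why its length is the better measure of how much is really being said.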

How much information is in here:

  • 3.1415 …

This is the value of pi. Suppose one million digits are displayed. The digits never settle into a repeating pattern, so you might be tempted to say that the amount of information is one million. Not so. The information can be represented succinctly as:

  • Pi to one million digits

Now the amount of information is just 24.

How much information is in here:

  • dkdl;eekrkpeosfdlzdmc;dsfkdopkfospkfs;dlkflas;krwe0q0--03ospaaj

This is just a random sequence of 63 characters. If any random characters will do, then the information can be represented simply as:

  • Random sequence of 63 characters

There are 32 characters.
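A general-purpose compressor gives a rough, related way to look at these examples. The sketch below uses Python's `zlib` as a stand-in for "shortest description". That is an assumption of this illustration, not the measure the post uses, and the text sizes and random seed are arbitrary.

```python
import random
import string
import zlib


def compressed_ratio(text: str) -> float:
    """Compressed size divided by original size (smaller means more redundant)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)


# A highly repetitive text: the same sentence over and over.
repeated = "It is a fine day. " * 1000

# A random text of the same length, built with a fixed seed for repeatability.
rng = random.Random(0)
noise = "".join(rng.choice(string.ascii_lowercase) for _ in range(len(repeated)))

repeated_ratio = compressed_ratio(repeated)
noise_ratio = compressed_ratio(noise)

print(f"repeated sentence: {repeated_ratio:.3f}")
print(f"random letters:    {noise_ratio:.3f}")
```

The repeated sentence shrinks to a tiny fraction of its size, while the random letters barely shrink at all. The difference from the post's argument is that a compressor has to reproduce that exact random string. The short description in the post works only because any random characters will do.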

Something is “information” if it is meaningful, non-random, and unpredictable [1].

How much information is in a rock? If we were to characterize all the properties (location, angular momentum, spin, velocity, and so on) of every atom in the rock, we would have a vast amount of information. A one-kilogram rock has 100000000000000000000000000000 (29 zeros) atoms. That’s one hundred million billion times more information than the genetic code of a human. But for most common purposes, the bulk of this information is largely random and of little consequence. So we characterize the rock for most purposes with far less information just by specifying its shape, location, and the type of material of which it is made. Thus, it is reasonable to consider the information of an ordinary rock to be far less than that of a human even though the rock theoretically contains vast amounts of information.

[1] If you know what’s going to be said (i.e. it’s predictable) then it’s not information.

– – Extracted from The Singularity is Near by Ray Kurzweil