Archive for May, 2006

Searching for Everything

Saturday, May 20th, 2006

Google’s mission in life — apart from making tons of money, becoming a verb, and (mostly) refraining from things evil — appears to be to provide the greatest amount of information possible to the greatest number of people.

Fine idea. And Google’s certainly doing a mighty fine job as far as the information on the web goes. But what about the great gobs of information that exist outside of the web itself? Some of this information sits on dusty library shelves, not yet in electronic form, and Google has, in fact, embarked on a very ambitious project to digitize entire college libraries (and copyright be damned, I say).

But there’s also a whole universe of information out there that already is in electronic format, and that already is accessible via the internet. But none of it is — yet — available through a Google search. If I had to venture a guess (and it appears that, yes, I have to!), I’d say these missing pieces of electronica are probably as large as the current, Google-searchable, piece of the internet.

And not only are they large…most of what they hold is high-quality stuff. Materials that are consciously archived are generally deemed to be worthy of the effort exactly because they represent high quality information (searching the net is a blast, but we all know that there’s a lot of garbáge out there). But that high-quality-but-invisible information isn’t showing up when you Google for it. Information like…

PACER. Public Access to Court Electronic Records is the US government’s effort at bringing the federal courts into the electronic age. They have semi-succeeded. PACER is an enormous dataset of federal court records that includes not only opinions but also the extensive, often arcane filings that go into making up a case docket. It’s a combination of pointers to records along with actual documents themselves. It’s a mixed-up, unwieldy, hit-and-miss, and very vast collection of information. It needs to be Googlized. There is a lot of other court information available at some federal courts that aren’t part of PACER, and at state and local courts as well, along with many court systems in other countries. All, all, all should be Googlized.

And while we’re on the topic of courts, there’s also…

Lexis-Nexis. How big is Lexis-Nexis? Can you say exabytes? OK, maybe not that large (yet)…but it’s well on its way. Lex-Nex is another source of court cases (including a lot of historical material) but it’s also much more: full-text newspapers, magazines, directories, professional forms, credit reports, public filings, attorney general opinions, and stuff I’m sure I haven’t discovered yet.

I have mixed feelings about linking Lex-Nex and Google. Google is free, Lex-Nex isn’t, and that can make for an unhappy marriage. But the content that Lex-Nex has to offer is so compelling that it’s probably worth exploring opportunities. After all, Google already makes available snippets of information from subscription sources whose full text you have to pay for. The bulk of Google Scholar materials seems to fall into this territory.

Internet Archive. How can you not love the Wayback Machine? In spirit, this is the undertaking most like Google itself. In practice, though, it’s badly in need of a system upgrade. If Google would take Wayback by the hand, meld it with its own vast collection of archived pages, make the whole thing searchable, and basically lead Wayback into the light, then Oh, What a Wonderful World This Would Be.

Newspapers. The first rough draft of history is available online back at least to the 1700s, in multiple archives, and from a variety of nations. In addition to mainstream publications like the New York Times, there are a host of others that offer important insights into corners of world history, like the Baltimore Afro American, the Johannesburg Sunday Times or the Sydney Morning Herald.

I haven’t even mentioned patents, copyright, trademarks, corporate annual reports (a particularly easy one!), and other electronic materials already available, though not search-engine-available.

Wouldn’t adding all this to a Google search just hopelessly clutter up the results page? It could, but it doesn’t have to. A search that recognizes that there is a lot of highly-relevant material in, say, a court case, could simply ask a question like “Do you want to see these results with court cases included?”

Well…do you?

pafalafaga aka David Sarokin

Google searches and the plus sign

Friday, May 19th, 2006

The Google search syntax accepts a plus sign in front of a search term. There are two situations where you might want to do this.

Firstly, a plus sign tells Google that you really truly honestly do want to search for a common term. A simple search for book a love returns mostly results that contain the more common phrase “Book of Love”.

Searching for book +a love fixes this; so does searching for “book a love” if you want the words to appear in that order – because common terms within a phrase are always matched.

A second use of the plus sign is to tell Google not to mess with your search term. A search for well yields a page cluttered with results for Wells Fargo, whereas these are absent from a search for +well.

The plus sign is not a search operator that I would use often, but just occasionally it’s a useful tool to get some unwanted entries off the search results list.
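
For anyone who builds search links by hand or in a script, here is a rough sketch of the three queries discussed above turned into search URLs. The base URL and the helper names (require, search_url) are my own assumptions for illustration, not anything Google documents here. The one detail worth noticing is that a plus sign inside a URL normally stands for a space, so a literal plus has to be encoded as %2B or the operator is lost.

```python
from urllib.parse import quote_plus

# Assumed base URL for a web search; the query text goes in the q parameter.
GOOGLE_SEARCH = "https://www.google.com/search?q="


def require(term: str) -> str:
    """Prefix a term with a plus sign, as described in the post above."""
    return "+" + term


def search_url(query: str) -> str:
    """Return a search URL with the query text safely encoded.

    quote_plus turns spaces into '+' and a literal '+' into '%2B',
    so the plus operator is not mistaken for a space.
    """
    return GOOGLE_SEARCH + quote_plus(query)


plain = "book a love"                               # common word may be ignored
forced = " ".join(["book", require("a"), "love"])   # book +a love
phrase = '"book a love"'                            # exact phrase, words in order

for query in (plain, forced, phrase):
    print(query, "->", search_url(query))
```

Pasting the printed links into a browser should give the same results as typing the queries into the search box.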

The U.S. before the U.S.

Thursday, May 18th, 2006

Techtor left a comment about the Cahokia post which I answered in that section. However, I think it deserves a wider audience.

Techtor's comment: “but I’ll not be surprised to know if there was a US before the US was actually founded (well, something to that effect).”

Well, yes there was – the Iroquois Confederation.

I'm not going to write much about it because I doubt there is much more that I could say that isn't already in these websites. Except for one thing, about the position of women in the Iroquois Confederation, because it provides a glimpse of the philosophy on which it was based.

It was a long battle for women to be allowed to vote in the United States. This “enlightened European based bastion of freedom” was determined to keep women as second class citizens.

THE INDIAN WOMEN: We whom you pity as drudges
reached centuries ago the goal that you are now nearing

We, the women of the Iroquois
Own the Land, the Lodge, the Children
Ours is the right to adoption, life or death;
Ours is the right to raise up and depose chiefs;
Ours is the right to representation in all councils;
Ours is the right to make and abrogate treaties;
Ours is the supervision over domestic and foreign policies;
Ours is the trusteeship of tribal property;
Our lives are valued again as high as man's.

From The Six Nations: Oldest Living Participatory Democracy on Earth –

There is a ton of information jam-packed into the website above. I recommend it to any who might be interested in the subject.
The Constitution of the Iroquois Nations: The Great Binding Law. Gayanashagowa –

Well, I lied, one other thing I'll write about is a character many in the English speaking world have known about since childhood. In fact, he was probably as responsible for the founding of the Iroquois Confederation as anybody – Hiawatha.

Hiawatha was a real person, not just a poem or a Disney cartoon character. I am going to editorialize here a bit.

Hiawatha was among the greatest of men ever to have lived on the North American Continent.

The poem about him by Henry Wadsworth Longfellow was in its own way a tribute, accurate or not. But when the Disney animators turned him into nothing more than a cutesy cartoon character, it was sort of like making George Washington no more than the equal of Mickey Mouse. – End of editorial. – The following is quoted from my response to Techtor in the comments section:

“And for those of you who are familiar with the poem about Hiawatha, he actually lived and became the spiritual leader of the Haudenosaunee.

“Tadadaho” is the title for that position and is still used for the league’s spiritual leader. It means “The 50th Chief.”

Today Hiawatha’s current successor in that position of spiritual leadership is Sid Hill of the Onondaga nation.”

“Hiawatha and the Iroquois Confederation” –

Hiawatha lived during the 15th century according to most sources, though some have him as early as the 12th century.

If anybody would like to read the poem, all 22 chapters of it are here: –

I still cry at the end.

Not all great civilizations have left monuments in stone, fantastic art, towers reaching to the heavens, or even crumbling mud brick foundations.

Some have merely left their good reputations. And that may be a better memorial than many empty piles of stone and mud.


Brick movies

Thursday, May 18th, 2006

Have you ever heard about movies made with Lego bricks? They are becoming more and more popular and already have cult status.

One of the best ones I’ve ever seen is the brick adaptation of Stanley Kubrick’s epic movie “2001 – A Space Odyssey”. It condenses the complex film into a very short version but still quotes the most important scenes.

Downloads are available in various formats: DivX, MPEG, QuickTime. Enjoy and have fun!

If you know some German, you will find more info on the official brick festival page.

The Google Robot FAQ

Thursday, May 18th, 2006

Just how does Google find the information that it needs? Through the Google Robot program, of course. However, the detailed workings of this program are not widely known.

Philipp Lenssen has put together a detailed Google Robot FAQ to shed some light on this.


(Image: Julia Eisenberg)

The Tools of a Web Researcher

Thursday, May 18th, 2006

This post continues our series on how a web researcher works (or at least how I do my own research). Today we will focus on basic tools. You can use other websites and software, but for simple searches these are all you need.

a. Search Engine

I primarily use Google, and most of the examples here use it to solve search problems. However, you can also use other search engines, and as you read along you will discover that you should use whichever one is appropriate for the task.

I suggest you use the one with which you are most comfortable. If that is Yahoo, that’s OK, just as long as you don’t limit yourself to it in each and every situation. Using your preferred search engine enables you to learn its more advanced inner workings. However, there is also a reason why lots of people use the more popular search engines: they simply produce results. So in case you are not using one of the top search sites, it might be worthwhile to get comfortable with them as well.

As you go along you will discover that the thought process, as I discussed previously, matters most, and technology only comes second.

b. Download the latest plug-ins and readers

Isn’t it frustrating when the website where the information is found cannot be displayed properly by your browser? When doing research on the web, it is inevitable that you will come across websites that present information as audio, video, or other files in different formats.

This book assumes that you are already familiar with browsers like Internet Explorer, so aside from search engines, plug-ins and readers are the most important tools for the web researcher. Plug-ins are add-on tools for your web browser (like Internet Explorer, Netscape or Firefox) that enable you to view and even hear information in different formats. Here are the most important plug-ins and readers that you will need to download:

  1. Acrobat Reader – A very good PDF file reader. Most case studies and academic papers are in PDF format.
  2. Flash Player – Not just for fun cartoons or games; some websites present a wealth of information as Flash animations. They usually publish these in Flash instead of as PowerPoint presentations.
  3. Updated versions of Windows Media Player, Real Audio and QuickTime – If you ever need to hear information as an audio file, you will need these players.
  4. Viewers for Microsoft Word, Excel and PowerPoint – You only need these if you don’t have Microsoft Office installed on your computer.

I arranged the software above in order of importance based on my personal experience. These are the types of files that you will encounter the most. One more thing: please install anti-virus software, since some of these files might be infected and could harm your system.

Search Engines trust their own Answers

Wednesday, May 17th, 2006

Yahoo also has an Answers service. It's very different from Google Answers, because the Yahoo questions are limited to 110 characters, and no cash changes hands. Yahoo Answers has now come out of beta and is being “integrated” into their core search services.

Yahoo was corporate-blogging about how “questions and answers are being surfaced within results”, and gave as an example a search for best dog for apartment, where the YA question/answer appears on the first page of the results.

Nicholas Carr decided to look into this more deeply. He found that a Google search for best dog for apartment returned a Google Answers question/answer on the first page of the results, with YA on page three.

Back on Yahoo, GA was nowhere to be seen, and MSN Search didn't return either the GA or YA pages.

Search engines trust their own answers.

For the Greater Google…Part II

Wednesday, May 17th, 2006

Back to the big question of “How can we make Google even better?” (and I’m not sure a death ray is the best way to go, here)

My big suggestion for the day is…

Fixing Ctrl-C. One of the most basic tasks in working with text is the old cut-and-paste. It’s also one of the most @#$%^&*! intolerable. How many times have you tried to simply paste text from a web page into a document, only to have it come out gibberish, or have screwy line breaks, or a ton of unwanted characters, or wind up with invisible code being made visible?

And tables…oy vey! Pasting text from web columns into a spreadsheet is a fool’s errand, as likely to result in all the text being dumped into a single cell as in any sort of neat, usable, formatted table. Cutting and pasting text and tables from PDF files is a Sisyphean task, and attempting to cut/paste from a Google cache will more than likely freeze your window.

Use ‘Paste-Special’, you say? To which I say…Hah!

This isn’t all Google’s fault, of course. But that’s just the problem. It’s nobody’s fault…and nobody’s working to fix it. If only Google would make it their mission, what an act of public service they would be performing. Heck, they’re all geniuses. Coming up with a good Ctrl-C/Ctrl-V fix should take ’em about ten minutes.

They’ve already got their Google Notebook in beta, so what better platform for working on the cut-and-paste problem?

It’s 11:00 a.m. in Washington DC as I post this. I’ll be looking for the fix by, shall we say, noon?

David Sarokin aka pafalafaga

The “allintext:” modifier

Wednesday, May 17th, 2006

Google supports various modifiers that you can use to refine your search query. An interesting one is “allintext:”. If you place this modifier before your query, Google will only return pages which contain all the query terms in the text of the page.

You can see this modifier in action by searching for [“to be or not to be”] (the square brackets indicate the beginning and end of the search text; you don’t type them in). Amongst the pages returned are some that don’t match the phrase exactly.

For example, in fourth position is a children’s page about two bees called 2Bee and Queen Nottoobee. Perhaps some other web page links to this page with the text “to be or not to be” as the hyperlink, or perhaps Google is just being terribly clever in returning this page, which certainly doesn’t contain the search term.

No problem: you can search instead for [allintext:“to be or not to be”]. Now the results drop by about a million, and every page has “to be or not to be” highlighted in its snippet.

Similarly, the “allintext:” modifier is useful for removing from the search results any pages where the search words are present in the URL or page title, but not in the page content.
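
To make the idea concrete, here is a toy model of that filtering. It is only a sketch of the behaviour described in this post, not how Google actually implements the modifier, and the Page record, the allintext function and the example.com pages are all made up for illustration. A page is kept only when every query word appears in its body text, so a match in the URL or title alone is not enough.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical record for one indexed page."""
    url: str
    title: str
    body: str


def words(text: str) -> set:
    """Split text into a set of lowercase words."""
    return set(re.findall(r"\w+", text.lower()))


def allintext(pages, terms):
    """Keep pages whose body contains every term; URL and title are ignored."""
    wanted = {t.lower() for t in terms}
    return [p for p in pages if wanted <= words(p.body)]


pages = [
    Page("http://example.com/hamlet", "Hamlet",
         "To be, or not to be: that is the question."),
    Page("http://example.com/to-be-or-not-to-be", "To be or not to be",
         "Meet 2Bee and Queen Nottoobee."),
]

matches = allintext(pages, ["to", "be", "or", "not"])
for page in matches:
    print(page.url)
```

The second page carries the query words in its URL and title but not in its body, so it drops out, which is the effect described above for the bee page.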

Prehistoric Civilization – Cahokia

Wednesday, May 17th, 2006

Pyramids topped with gleaming temples, basketball courts and other play fields, elaborate beading and feathering, organized streets and plazas, even the homes of the wealthier elevated for a better view.

Sounds like a good introduction to a story about the Maya or Aztec. But it is not.

It is about the Native American metropolis known as Cahokia, located in the American Midwest.

The civilization involved may very well have been as advanced as those found in Mexico and Central America, the main visual difference being that the people of Cahokia, living on the Mississippi flood plain, built with packed earth, timber and thatching, rather than with stone.

The main technological difference is that the civilizations to the south had developed a means of writing (glyphs) and are classed as “historic” civilizations, whereas the people of Cahokia had not, and are thus classed as “prehistoric.”

For about 500 years, Cahokia was the major hub of a prehistoric civilization that, at its peak, spread from Minnesota to Florida and across the southeast.

The city covered about six square miles and had a population of up to 20,000. Houses were arranged in neat streets and around open plazas. Cahokia was a planned city with elaborate public buildings and elite residences at its core.

The people of Cahokia enjoyed “widespread commerce; stratified social, political, and religious organization; specialized and refined crafts; and monumental architecture.” – quote from Cahokia Mounds State Historic Site –

What finally happened to the Cahokians is unknown, but the decline seems to have been gradual. It began in the 1200s and the site had been abandoned by 1400 CE.

However, we can safely say that about 800 years ago when the population was at its peak, Cahokia was one of the largest urban centers in the world. A massive wooden wall enclosed the heart of the city. Within that wall were the most important structures and the most elite neighborhoods.

The most impressive buildings at Cahokia were the temples and homes of the rulers, the grandest being the 5,000-square-foot home of the great chief atop the city’s central pyramid, Monks Mound. This structure probably served as both a religious building and a private residence.

There is also speculation that this building atop the central earthen pyramid had its roof and walls covered with sheets of mica, at least at some point in its history. Because of the reflective nature of that mineral, and depending on the angle of the sun, this structure would have glittered over the city in colors ranging from mother-of-pearl to pure gold. Another name for Cahokia is “City of the Sun” and with the golden light of sunrise and sunset reflecting in a blaze of glory from the mica covered temple, it may have seemed that the sun himself truly dwelled in the heart of the city.

Cahokia was governed by a four-tiered socio-political hierarchy. The highest power was the chief who was also thought to be the brother of the sun. Just under the chief were his immediate family and friends who formed the elite class. These in turn controlled the heads of family clans, who in turn directed the commoners. Status, gender, age and kinship all determined the role of each person.

Cahokian agriculture produced squash, pumpkins, sunflowers and corn. The Cahokians also fished, hunted, and gathered nuts and berries such as pecans, hickory nuts, and blackberries.

The Cahokians developed several leisure activities including music, song, and dance, along with games of chance and skill. In their free time, they played guessing games with shells and gambled with dice, and youngsters entertained themselves by attempting to catch tethered hollow bones on the tip of a pointed stick. The main sport at Cahokia was “chunkey,” in which two players threw javelins at a rolling, concave stone, trying to mark the place where it would come to a stop. It seems a type of basketball was also played.

Archaeologists classify Cahokian civilization as “Mississippian” Culture.

For additional information:

Cahokia Mounds State Historic Site – From National Park Service –

Cahokia Mounds State Historic Site – From

Cahokia – From Wikipedia –

Cahokia Mounds Photo Gallery – From Archaeoblog –

Metropolitan Life on the Mississippi – From Washington Post –


Image courtesy of Cahokia Mounds State Historic Site

Till next time