So what is the Search Visualizer, and what is Search Vis Ltd, and why should anyone care?

The Search Vis Ltd question is easy to answer: it’s the company that we set up to commercialise the Search Visualizer (SV) software (www.searchvisualizer.com).

The SV software produces images like this.

"witch" and "sleep" in Macbeth

This image shows you where the words “witch” (in green) and “sleep” (in red) occur in Macbeth. The whole of Macbeth, in one image. You can see immediately that they’re not randomly distributed – instead, they cluster and they form patterns. That ability to see significant patterns is at the heart of SV.

We produced the SV because the ordinary search engines and text analysis packages weren’t doing a lot of things that we wanted to do, but that we could do via images like the one above.

Take online search, for instance. Any of the ordinary search engines can find you several million potentially relevant records in under a second. Then what? Then you have to wade through them, one by one, trying to work out which ones actually are relevant. At that point, you’re limited by reading speed. Even if you’re a really fast reader, you’ll still only be able to assess a few records per minute. There had to be a better way.

Similarly, there are lots of types of search that are difficult with ordinary search engines. Suppose you’re trying to find a particular James Smith, but you don’t know whether they might have a middle name that will show up in official records, so searching on “James Smith” as a phrase wouldn’t catch “James F. Smith” or “James Frederic Smith”. Some search engines will let you search for the words “James” and “Smith” within a specified distance of each other, if you don’t mind going into the advanced search options and learning how to use the relevant Boolean command. That had room for improvement.
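For readers who like to see the idea spelled out, here is a minimal sketch of that kind of "within a specified distance" search. It is only an illustration in Python, not how any particular search engine (or SV) does it; the function name `within_distance` and the `max_gap` parameter are made up for this example.

```python
import re


def within_distance(text, first, second, max_gap=2):
    """Return True if `second` follows `first` with at most `max_gap` words between them.

    Hypothetical helper for illustration only; not part of any search engine's API.
    """
    # Split the text into lower-case words, ignoring punctuation.
    words = re.findall(r"[a-z]+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # A gap of N words between the two means their positions differ by N + 1.
    return any(
        0 < j - i <= max_gap + 1
        for i in first_positions
        for j in second_positions
    )


for record in ["James Smith", "James F. Smith", "James Frederic Smith"]:
    print(record, within_distance(record, "James", "Smith"))
```

All three spellings of the name match, which a plain phrase search on “James Smith” would not manage.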

If you’re working with documents, as a historian or someone interested in literature, or analysing transcripts from interviews, then there’s software which lets you do very sophisticated statistical analysis and content analysis, if you don’t mind spending a fair chunk of money on a licence, and going on a training course to learn how to use it. Not ideal.

We spotted that all these problems involved the same underlying issue. Often, you don’t need to read the words themselves – that slows you down, and gets in the way. Often, what you want is to see the patterns that the words form – that’s far faster than reading, and lets you see the wood, not the trees.

So we built the Search Visualizer. It works by colour-coding your chosen keywords, and then showing you where they occur within a schematic image of each document. That way, you can very swiftly identify relevant documents, or relevant sections within a document, or patterns and structures within documents. It’s a whole new way of making sense of them.
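To make the principle concrete, here is a rough sketch in Python. It is not the SV code; it is a toy version that assumes one cell per word, uses a letter in place of each colour, and prints rows of text instead of drawing an image. The function name `keyword_map` and its parameters are invented for this example.

```python
import re


def keyword_map(text, colours, width=40):
    """Return rows of single-character cells, one cell per word of `text`.

    `colours` maps each keyword to a one-letter colour code; every other
    word becomes ".". Illustrative only: the real SV draws images.
    """
    # Break the document into lower-case words, dropping punctuation.
    words = re.findall(r"[a-z']+", text.lower())
    # Replace each word with its colour code, or "." if it is not a keyword.
    cells = [colours.get(word, ".") for word in words]
    # Wrap the cells into fixed-width rows to form the schematic "page".
    return ["".join(cells[i:i + width]) for i in range(0, len(cells), width)]


sample = "The witch did sleep. No witch shall sleep tonight"
for row in keyword_map(sample, {"witch": "G", "sleep": "R"}, width=5):
    print(row)
```

Even in this crude form, you can see where the keywords fall without reading a single word of the text.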

One thing that struck us during the testing of SV was how quickly the users found creative ways of using SV for their particular needs. There wasn’t just a single way of using it; instead, most users were swiftly identifying a specific task that they would use the SV for. One user regularly worked with technical documents several hundred pages long, and loved being able to find the relevant section within a document in a matter of seconds rather than minutes. Another user wanted SV because it let her find relevant records easily on internet searches, when the ordinary search engines were finding huge numbers of false positives. Another user found it invaluable for finding people. People working in the humanities were very interested in being able to see structures and literary patterns within historical records and literary texts.

That’s why we’ve set up this blog. It’s intended to be a place where people can exchange ideas, information, hints and tips about SV in relation to their work, their interests and their field. We hope that it will spark off similar specialist blogs within fields where SV helps people get new perspectives and new ideas. We’ve cross-linked to the SV site (www.searchvisualizer.com) so that you can see what’s happening over there. The online SV is free, and lets you search the Internet, or search a specified site on the Internet, or search the classic texts that we’ve put on the SV site (currently some Shakespeare plays, and some of the official records from the American Civil War).

We hope you’ll find it useful and enjoyable.

Today’s closing thought: Here’s an SV image of the opening section of “Romeo and Juliet” followed by one of the closing section of the same play, with the word “love” in red and the word “death” in green.

"love" and "death" in the opening part of Romeo and Juliet

There’s a juxtaposition and a mirroring of the two themes. In the opening section, two initial mentions of love are followed by several mentions of death. In the closing section, two final mentions of love are preceded by several mentions of death. It’s not a perfect mirror image, but it’s close.

"love" and "death" in the closing part of Romeo and Juliet

The thought is: Are the juxtaposition and the mirroring likely to be deliberate literary devices, or are they more likely to be subconscious uses, or coincidence?

Gordon Rugg is a former timberyard worker, archaeologist and English lecturer who ended up in computer science via psychology. He's the same Gordon Rugg who did the Voynich Manuscript work, and the books with Marian Petre about research. He's co-inventor of the Search Visualizer.
