Square sizes and number of documents

Search Visualizer lets you vary the size of the squares in the images, and vary the number of documents shown per screen. This combination of features gives you a lot of power and flexibility in how you deal with searches, images and documents.

Square sizes

The option for changing square sizes is located under “More options/Other settings”.

Image

If you click on “More options” you can then choose “Other settings”.

Image

You then have a choice of five square sizes.

Image

You can use this feature to select a square size that you find comfortable to view.

You can also use it to show more detail in an image, particularly in combination with the “results per screen” option.

Results per screen

The standard Search Visualizer (SV) view shows ten results per screen when you search the Web. You can change this setting with the “results per screen” option, which is located under “More options”.

Image

Image

SV currently offers a choice of 10, 5, 2 or 1 results per screen. Each of these is useful for different purposes.

For Web searching where the documents are relatively short, the settings of 10 results per screen and 5 results per screen let you work through large numbers of results swiftly, without needing to scroll down to see an entire record.

If you’re working with large documents, or looking for patterns within a document you’ve already found, then the settings of 2 or 1 per screen are useful. If you combine one of these settings with a square size of “smaller” or “tiny”, you can display very large documents.

Choosing a combination of settings: some examples

10 results per screen

Here’s a screenshot of a search for wind wave solar on a setting of 10 results per screen.

Image

The Science Online article second from the left has some clearly defined bands of colour within the image, but the image goes off the bottom of the screen, so we would have to scroll down to see what happens in the rest of the image.

5 results per screen

Here’s the same search for wind wave solar on a setting of 5 results per screen.

Image

On this setting, we’re able to see the entire record without needing to scroll down.

This type of colour banding is useful for identifying documents that are clearly structured into sections with separate themes. It’s particularly useful if you’re trying to find something which isn’t a “usual suspect” – for instance, types of renewable energy other than wind, wave and solar. Our article on searching with the Boolean NOT covers this in more detail.

2 results per screen

The setting of 2 results per screen is particularly useful for comparing and contrasting the patterns in two documents, showing them side by side. SV can easily fit two entire Shakespeare plays onto a single screen, as in this example, which shows mentions of death and sleep in A Midsummer Night’s Dream and Macbeth.

This pair of images shows that both plays open with mentions of both death and sleep, and mention both themes repeatedly after that; however, they diverge at the end, with two closing mentions of sleep in A Midsummer Night’s Dream and closing mentions of death in Macbeth.

Image

1 result per screen

The setting of 1 result per screen is useful for handling a single large document. We’ve included some substantial texts on the SV website, in the “sample texts” option, so that people can practise using SV features on them.

The “sample texts” option is located to the left of the search bar.

Image

The largest sample text at present is the main Gettysburg document, which contains the official war record for the Battle of Gettysburg and is over half a million words long. On a laptop, with the “tiny squares” setting and the “1 record per screen” option, each screenful of the image represents about 80,000 words, so you can cover the entire document with eight scroll-down clicks.
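If you want to estimate how many screenfuls a document of your own will need, the arithmetic is a simple rounded-up division. The sketch below is only an illustration: the function name `screens_needed` is hypothetical, the 80,000 words per screen is the approximate laptop figure quoted above, and 500,000 is a round lower bound rather than the real word count of the Gettysburg document.

```python
import math


def screens_needed(total_words: int, words_per_screen: int) -> int:
    """Estimate how many screenfuls are needed to show a whole document.

    Both arguments are rough figures; words_per_screen depends on the
    screen, the square size and the results-per-screen setting.
    """
    if words_per_screen <= 0:
        raise ValueError("words_per_screen must be positive")
    if total_words < 0:
        raise ValueError("total_words must not be negative")
    # Round up: a partly filled final screen still has to be viewed.
    return math.ceil(total_words / words_per_screen)


# Round figures taken from the text above (assumptions, not measurements).
print(screens_needed(500_000, 80_000))  # prints 7
```

Because the real document is longer than the round figure used here, the actual count will be somewhat higher, which is in keeping with the eight scroll-down clicks mentioned above.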

If you save the document as an image file, then you can shrink it even more, though the shrinking process will eventually hit problems with image quality; this example is pushing the principle to the edge. The illustration below shows what you get using a standard “save as image” without any enhancement. If you’re reasonably familiar with image processing, then you can sharpen the image to some extent. In practice, we find that documents of about a hundred thousand words – standard book length – come out as crisp images. When you go beyond a quarter of a million words, then you may need to sharpen the images after you’ve saved them.

Here’s the image for mentions of cavalry within the Gettysburg document. It’s the classic pattern for a nineteenth-century land battle: the first encounter is between the scouting cavalry from each side, then the battle is dominated by infantry and artillery, and then at the end, the losing side withdraws, screened by its own cavalry, and harried by the victors’ cavalry.

One striking feature is how suddenly the number of mentions of cavalry increases towards the end of the document.

That final change in frequency of mentions really is as abrupt as it looks; here’s a closeup of the transition point.

Image

That sort of striking pattern is useful not just for analysing texts, but also for purposes such as finding the right section in a book or large online manual. (There’s an urban legend that the hard copy manuals for a 747 airliner weigh so much that it couldn’t take off with all of them on board. That may be pushing the truth a bit, but finding the right section within a big, poorly written manual isn’t usually fast or fun, so this facility in SV can make life significantly easier.)

Finding the right balance

It’s a good idea to experiment with square sizes and the number of records per screen, to get a feel for how they interact, so you can find the best combination for your needs.

For instance, if the document contains a pattern of clearly defined themes, that will usually be more visible with some square sizes and records per screen than with others. As a rough rule of thumb, the narrower the column you’re seeing on screen, the easier it is to see themes, though that also means that you won’t usually see the whole document in one go. Here’s an example.

This is a closeup of the image for wind wave solar on a setting of 5 results per screen.

Image

On this setting, the pattern is clear. Here is the same search using the “enlarged image” option, which produces a much broader image.

Image

Although the pattern is still visible, it’s not as obvious as in the longer, thinner image.
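To get a feel for why a narrower column makes bands easier to spot, here is a toy sketch in Python. It is not how SV itself is implemented; it simply assumes one square per word, laid out left to right and wrapped at a fixed column width. The function name `render_grid` and the made-up 24-word “document” are purely illustrative.

```python
def render_grid(words, keywords, columns):
    """Lay out one mark per word, wrapped at `columns` marks per row.

    '#' stands for a word that matches a keyword, '.' for any other word.
    This is a toy model of an image made of squares, not SV's own code.
    """
    if columns <= 0:
        raise ValueError("columns must be positive")
    wanted = {k.lower() for k in keywords}
    marks = ["#" if w.lower() in wanted else "." for w in words]
    return ["".join(marks[i:i + columns]) for i in range(0, len(marks), columns)]


# A made-up document: a block of keyword mentions in the middle third.
document = ["filler"] * 8 + ["wind"] * 8 + ["filler"] * 8

print("Narrow column (4 squares wide):")
for row in render_grid(document, ["wind"], 4):
    print(row)

print("Broad column (12 squares wide):")
for row in render_grid(document, ["wind"], 12):
    print(row)
```

In the narrow layout the keyword block fills two complete rows and reads as a solid band. In the broad layout the same block is split across the end of one row and the start of the next, so it is harder to see as a single theme.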

Conclusion

This combination of settings makes it possible to see patterns in very large documents. For anyone working with literary texts, this can give powerful insights into themes and structures.

This is also useful if you’re trying to locate the relevant section within a large online technical manual.

These are topics to which we’ll return in other articles.

Gordon Rugg

