Monday, August 4, 2008

Star Plots

- Star plots allow you to compare multiple variables for each observation.

- This star plot was used to represent the differences between automobiles in 1979. I thought it was kind of cool to see the different ways people came up with to star plot a vehicle.

- http://www.itl.nist.gov/div898/handbook/eda/section3/gif/starplot.gif
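
To get a feel for how one star is drawn, here is a minimal sketch of my own, not the method behind the NIST image. It assumes each variable gets its own evenly spaced spoke and is scaled by the largest value of that variable in the data set. The function name `star_vertices` and the car numbers are made up for illustration.

```python
import math

def star_vertices(values, maxima):
    """Return the (x, y) corners of one star, one corner per variable."""
    n = len(values)
    points = []
    for k, (value, maximum) in enumerate(zip(values, maxima)):
        radius = value / maximum       # scale each variable to the range 0..1
        angle = 2 * math.pi * k / n    # spokes are evenly spaced around the centre
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Hypothetical car: price, mileage, weight, length (invented numbers).
car = [4000, 22, 2900, 186]
largest_in_data = [16000, 44, 5800, 248]
for x, y in star_vertices(car, largest_in_data):
    print(f"{x:+.2f} {y:+.2f}")
```

Joining the corners in order gives the star for that one observation; drawing one star per car lets you compare them side by side.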

Similarity Matrix


- A similarity matrix is a matrix of scores which express the similarity between two data points.

- This display shows correlations/co-occurrences between two groups of objects, or between objects belonging to the same group. One group of objects is presented as the rows of the similarity matrix and the other as the columns. The matrix cells are color-coded based on the similarity values. Blue indicates positive values and red negative ones. The color scale runs from -1 to 1 by default, but if the similarity values fall outside this range, the thresholds are rescaled so that full blue and full red represent the 90th percentile of the value range.
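
As a rough sketch of how such a matrix could be filled in, the code below uses Pearson correlation as the similarity score. That choice is my assumption; the display described above may use a different measure. The names `pearson` and `similarity_matrix` and the sample numbers are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, between -1 and 1."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def similarity_matrix(row_objects, column_objects):
    """One row per object in the first group, one column per object in the second."""
    return [[pearson(r, c) for c in column_objects] for r in row_objects]

# Invented measurements for two small groups of objects.
rows = [[1, 2, 3], [2, 1, 4]]
columns = [[2, 4, 6], [3, 2, 1]]
for line in similarity_matrix(rows, columns):
    print(" ".join(f"{score:+.2f}" for score in line))
```

Positive scores would be shaded blue and negative ones red under the color scheme described above.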

Stem and Leaf Plot



- A stem and leaf plot is a device for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution.

- Study the data for Infant Mortality Rates, the number of infant deaths per 1,000 live births, of countries in Western Africa. Use the World Population Data Sheet in your Reference Section. You will notice that most of these values are in the interval 100 to 175. A stem-and-leaf plot for this data is shown below.
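
Here is a small sketch of how a stem-and-leaf plot can be built by hand. It assumes non-negative whole numbers, with the last digit as the leaf and everything before it as the stem. The rates in the list are invented placeholders in the 100 to 175 range, not the figures from the World Population Data Sheet.

```python
def stem_and_leaf(values):
    """Group whole numbers by stem (all but the last digit) and leaf (last digit)."""
    ordered = sorted(values)
    # Keep empty stems too, so gaps in the distribution stay visible.
    stems = {stem: [] for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1)}
    for value in ordered:
        stems[value // 10].append(value % 10)
    return stems

# Invented placeholder rates, NOT real data.
rates = [102, 113, 117, 125, 126, 131, 148, 174]
for stem, leaves in stem_and_leaf(rates).items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Turned on its side, the printed rows look like the bars of a histogram, which is why the two displays get compared.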

Box Plot

- A boxplot, or box and whisker diagram, provides a simple graphical summary of a set of data.
- The lines on the box plot image above represent the locations of the lowest number, first quartile, second quartile (the median), third quartile and highest number, in relation to a scale running from 0 to 140.
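
Those five numbers can be worked out with Python's standard `statistics` module. This is only a sketch: quartile conventions differ between textbooks and tools, and the "inclusive" method used here is one choice among several. The function name `five_number_summary` and the sample data are made up.

```python
import statistics

def five_number_summary(data):
    """Return (lowest, first quartile, median, third quartile, highest)."""
    # method="inclusive" treats the data as the whole population; other
    # conventions can give slightly different quartiles.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

# Invented values spread over a 0 to 140 scale.
sample = [140, 0, 70, 35, 105]
print(five_number_summary(sample))
```

The box is drawn from the first to the third quartile, with a line at the median and whiskers out to the lowest and highest numbers.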

Histogram

- A histogram displays tabulated frequencies as bars, one bar per range of values.

- Histogram of a column. This is a bar chart of the relative frequencies of different value ranges. It's useful for getting a quick look at the distribution of values of the selected metric column. The more values that fall in a range, the higher the bar. In the above chart you can see that a typical procedure name consists of 7-16 characters. Shorter and longer names are fewer. There are no names with just 1 or 2 characters.

- http://www.aivosto.com/project/help/pm-histogram.gif
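
The counting behind a chart like this can be sketched in a few lines. The code assumes equal-width ranges that start at multiples of the bin width; the tool that made the image above may bin differently. The function name `relative_frequencies` and the name lengths are invented for illustration.

```python
from collections import Counter

def relative_frequencies(values, bin_width):
    """Map the start of each value range to the share of values falling in it."""
    counts = Counter((value // bin_width) * bin_width for value in values)
    total = len(values)
    return {start: counts[start] / total for start in sorted(counts)}

# Invented procedure-name lengths, in characters.
name_lengths = [3, 7, 8, 9, 11, 12, 12, 14, 16, 21]
width = 5
for start, share in relative_frequencies(name_lengths, width).items():
    print(f"{start:>2}-{start + width - 1:<2} {'#' * round(share * 20)}")
```

Each printed row of `#` marks stands in for one bar: the more values in the range, the longer the row.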

Parallel Coordinate Graph



- In the parallel coordinate plot, each variable is graphed on a vertical axis. A data element is plotted as a connected set of points, one on each axis.

- This parallel coordinate (PC) graph (4) serves as the control panel for selecting attributes to be explored and provides easier identification of multivariate relationships across spatial domains in the choropleth maps and the scatter plot.
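
To make the "connected set of points" idea concrete, here is a minimal sketch that turns each data element into a list of (axis, height) points. It assumes every axis is rescaled to run from 0 to 1, which is a common choice but not the only one. The function name `parallel_coordinate_lines` and the records are made up.

```python
def parallel_coordinate_lines(records):
    """Turn each record into a polyline: one (axis index, height 0..1) point per variable."""
    columns = list(zip(*records))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    lines = []
    for record in records:
        line = []
        for axis, value in enumerate(record):
            span = highs[axis] - lows[axis]
            # A variable that never changes is drawn at mid-height.
            height = (value - lows[axis]) / span if span else 0.5
            line.append((axis, height))
        lines.append(line)
    return lines

# Invented records with three variables each.
records = [[0, 10, 7], [5, 20, 7], [10, 30, 7]]
for line in parallel_coordinate_lines(records):
    print(line)
```

Drawing straight segments between consecutive points in each list gives one line per data element across the vertical axes.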

Triangular Plot



- A triangular plot usually uses three variables which sum to a constant. It graphically depicts the ratios of the three variables as positions in an equilateral triangle.

- The axes of the figure show the estimated fraction of the population intending to vote for each of the major parties; the white circle shows the current estimate from opinion polls. The coloured areas show the regions of the plot in which -- under the assumption of uniform national swing -- each of the corresponding major parties would win a majority in Parliament.
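
Placing a point inside the triangle comes down to a little arithmetic. The sketch below assumes an equilateral triangle with corners at (0, 0), (1, 0) and (0.5, sqrt(3)/2), one corner per variable; other layouts are equally valid. The function name `ternary_to_xy` and the vote shares are invented, not taken from any poll.

```python
import math

def ternary_to_xy(a, b, c):
    """Place three parts of a whole inside an equilateral triangle.

    A point that is all `a` lands on (0, 0), all `b` on (1, 0),
    and all `c` on (0.5, sqrt(3) / 2).
    """
    total = a + b + c          # works whatever constant the three values sum to
    b_share = b / total
    c_share = c / total
    x = b_share + c_share / 2
    y = c_share * math.sqrt(3) / 2
    return x, y

# Invented vote shares for three parties, in percent.
x, y = ternary_to_xy(40, 35, 25)
print(f"{x:.3f} {y:.3f}")
```

Because the three shares always sum to the same total, two coordinates are enough to show all three at once.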