Understanding big data

Before trying to access and use government data, we should take a step back and try to understand it.

First, what are we talking about when we say “big data”? 

It is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. In other words, there is lots of it, it moves very quickly, it comes in lots of formats, and normalizing it so we can understand it is very complex.

Jean O’Grady.

Big data slides.

Next, what is the difference between structured and unstructured data?

Bob Ambrogi provided the overview here. Listen.

I summarize below.

What is unstructured data?

  • Origin: comes from a broad range of different sources.
  • Challenge: Curation. Pulling the data together and then relating the data to a particular research question.
  • Example: legislative and pre-legislative materials
  • Company: voxgov aggregates data from a variety of sources, and then provides tools to make that data understandable and useful.

What is structured data?

  • Origin: Exists in a structured database collection.
  • Challenge: The structure itself will create limitations on the usefulness and accessibility.
  • Example: PACER is a key example of a database whose structure provides a degree of usefulness but also limits it.
  • Company: PacerPro provides new access to PACER data along with tools to make that PACER data more useful.
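
For readers who like to see the distinction in code, here is a minimal sketch of the two cases described above. Everything in it is an assumption made for illustration: the docket records, the field names (case_no, court, filed), the sample sentence, and the helper functions filed_in and find_dates are all hypothetical, and nothing here reflects how PACER, PacerPro, or voxgov actually store or query data.

```python
import re

# Hypothetical structured records: every row has the same named fields.
# All values are invented for illustration only.
dockets = [
    {"case_no": "1:13-cv-00001", "court": "D. Example", "filed": "2013-01-15"},
    {"case_no": "1:13-cv-00002", "court": "D. Example", "filed": "2013-02-20"},
    {"case_no": "2:13-cv-00003", "court": "D. Sample", "filed": "2013-03-05"},
]


def filed_in(records, court):
    """Structured lookup: filter on a field the schema already defines."""
    return [r["case_no"] for r in records if r["court"] == court]


# Hypothetical unstructured text: there are no fields, so any structure
# has to be imposed afterwards by the researcher.
statement = (
    "The committee met on March 5, 2013 to discuss the bill. "
    "A second hearing is planned for April 12, 2013."
)

# A simple pattern for dates written as "Month D, YYYY".
DATE = re.compile(
    r"(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}"
)


def find_dates(text):
    """Unstructured lookup: pull date-like strings out of free text."""
    return DATE.findall(text)


print(filed_in(dockets, "D. Example"))
print(find_dates(statement))
```

The structured lookup is easy, but it can only answer questions the fields anticipate. The unstructured lookup can ask anything, but only after someone does the curation work of deciding what to look for and how to recognize it.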

Need to tap unstructured government data to understand legislative or regulatory intent? Join us tomorrow to learn about concrete tools from voxgov that give you a competitive edge.
