IssueLab Data Share


Overview

We are excited to share the nonprofit-produced research data we archive.

Everything you need to know to get started with our data appears below. However, we hope you will take a little time to understand the ideas behind a couple of protocols and standards that we adhere to in order to provide our data to the widest audience.

We use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to provide our data. Never heard of OAI-PMH? All you ever wanted to know is available on the OAI website. If you don't have a lot of time (and who does?), scan this informative tutorial to get the lay of the land. In a nutshell,

"The essence of the open archives approach is to enable access to Web-accessible material through interoperable repositories for metadata sharing, publishing and archiving. It arose out of the e-print community, where a growing need for a low-barrier interoperability solution to access across fairly heterogeneous repositories lead to the establishment of the Open Archives Initiative (OAI). The OAI develops and promotes a low-barrier interoperability framework and associated standards, originally to enhance access to e-print archives, but now taking into account access to other digital materials. As it says in the OAI mission statement 'The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.'"

From "OAI for Beginners: Basic OAI concepts and features"

In keeping with OAI-PMH, our data is expressed in XML and uses unqualified Dublin Core. There is a lot more to learn about Dublin Core on the DC website. Here's a sentence to give you the gist of what's going on:

The Dublin Core Metadata Initiative (DCMI) is an organization dedicated to promoting the widespread adoption of interoperable metadata standards and developing specialized metadata vocabularies for describing resources that enable more intelligent information discovery systems.

From "Dublin Core Metadata Initiative: About the Initiative"


Our Data

Our data is delivered in XML -- one XML file per research listing. This chart provides info on all the data points we collect. Note that not every research listing includes all of these data points. At a minimum, the data points that appear in bold type will be included in every XML file.

Data point | Element | Description
Coverage | <dc:coverage> | The geographic areas the research considers (uncontrolled list).
Creator | <dc:creator> | The author(s) of the research.
Date | <dc:date> | The date the research was published. Format YYYY-MM-DD.
Description | <dc:description> | The summary provided for the research work.
Format | <dc:format> | When available, the file format of a saved resource. Formats include: pdf; doc; xls; rtf; txt; ppt, etc. Note: If a record includes "<dc:format>scribd;xxxxx;xxxxx</dc:format>", a Scribd.com document is available (see the note about Scribd below).
Identifier | <dc:identifier> | The listing's unique identifier.
Language | <dc:language> | The language in which the research was authored/published.
Publisher | <dc:publisher> | The nonprofit organization(s) that made the research in question available to/through the IssueLab archive.
Rights | <dc:rights> | The copyright and usage instructions for the research work.
Subject | <dc:subject> | The issue areas that the research work falls under (see "Issue Areas and Sets" below). Research works can fall into up to three issue areas.
Title | <dc:title> | Title of the research work.
Type | <dc:type> | The type of research. Controlled list: CaseStudy; Dataset; Ethnography; Evaluation; FactSheet; InteractiveResource; Literature/Research Review; MovingImage; Policy/Issue Brief; Presentation/Slideshow; Report/Whitepaper; StillImage; Survey; Testimony; Toolkit/Guide.

XML Example

Here is an example of the contents of one research metadata record:

<?xml version="1.0" encoding="UTF-8" ?> 
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2008-04-09T18:53:51Z</responseDate> 
  <request identifier="2003_2004_statewide_survey_of_immigrants_and_refugees" verb="GetRecord" metadataPrefix="oai_dc">http://harvest.issuelab.org/provider/oai</request> 
  <GetRecord>
    <record>
      <header>
        <identifier>oai:harvest.issuelab.org:2003_2004_statewide_survey_of_immigrants_and_refugees</identifier> 
        <datestamp>2008-04-09T17:26:47Z</datestamp> 
        <setSpec>human_rights_and_civil_liberties</setSpec> 
        <setSpec>education_and_literacy</setSpec> 
        <setSpec>immigration</setSpec> 
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>2003-2004 Statewide Survey of Immigrants and Refugees</dc:title> 
          <dc:subject>education_and_literacy</dc:subject> 
          <dc:subject>human_rights_and_civil_liberties</dc:subject> 
          <dc:subject>immigration</dc:subject> 
          <dc:creator>Illinois Coalition for Immigrant and Refugee Rights</dc:creator> 
          <dc:publisher>Illinois Coalition for Immigrant and Refugee Rights</dc:publisher> 
          <dc:date>2007-08-01</dc:date> 
          <dc:description>This report weaves together demographic and field research conducted by ICIRR in 2003 to assess the needs of immigrants and refugees throughout Illinois and to recommend the formation of an Illinois immigrant integration policy.</dc:description> 
          <dc:identifier>http://www.issuelab.org/research/2003_2004_statewide_survey_of_immigrants_and_refugees</dc:identifier> 
          <dc:type>Text</dc:type> 
          <dc:language>eng</dc:language> 
          <dc:format>pdf</dc:format> 
          <dc:format>scribd;37067679;key-174p87aga4148zpw329a</dc:format> 
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
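
To work with a record like the one above in a script, the Dublin Core fields can be pulled out with a standard XML parser. The following is only a sketch using Python's standard library; the function name `parse_record` and the trimmed `SAMPLE` document are illustrative assumptions, not part of IssueLab's service. The namespace URIs are copied from the example record.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example record above.
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"

# A shortened copy of the example record, used here only for illustration.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:harvest.issuelab.org:2003_2004_statewide_survey_of_immigrants_and_refugees</identifier>
        <setSpec>immigration</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>2003-2004 Statewide Survey of Immigrants and Refugees</dc:title>
          <dc:subject>education_and_literacy</dc:subject>
          <dc:subject>immigration</dc:subject>
          <dc:format>pdf</dc:format>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_record(xml_text):
    """Hypothetical helper: return the header identifier, the setSpecs,
    and a dict mapping each Dublin Core field name to its list of values."""
    root = ET.fromstring(xml_text)
    header = root.find(".//oai:header", NS)
    fields = {}
    for element in root.iter():
        # Keep only elements in the Dublin Core namespace (dc:title, dc:subject, ...).
        if element.tag.startswith(DC_PREFIX):
            name = element.tag[len(DC_PREFIX):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "sets": [s.text for s in header.findall("oai:setSpec", NS)],
        "fields": fields,
    }


record = parse_record(SAMPLE)
print(record["fields"]["title"][0])
print(record["fields"]["subject"])
```

Each field maps to a list because elements such as dc:subject and dc:format can repeat within one record.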

Harvesting IssueLab's Data

The Six Verbs

As we mentioned above, we use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to let others access our data. There are six case-sensitive commands, or "verbs", provided by the OAI protocol for querying purposes:

  1. Identify - identifies the archive you are harvesting.
  2. ListSets - provides a list of the sub-sets or groupings of data offered (see "Issue Areas and Sets" below).
  3. ListMetadataFormats - provides info about the formats we offer our metadata in; we offer metadata in oai_dc.
  4. ListIdentifiers - provides a list of research listing identifiers available to harvest.
  5. ListRecords - a list that includes an individual metadata record for every research listing available in the IssueLab archive.
  6. GetRecord - you issue this HTTP request in conjunction with a record identifier to obtain the metadata for the specified record only.

Here are the most basic HTTP requests you can issue. Note: a few verbs (ListIdentifiers, ListRecords, and GetRecord) require that you include the "metadataPrefix" parameter as well as the "verb":

http://harvest.issuelab.org/provider/oai?verb=Identify

http://harvest.issuelab.org/provider/oai?verb=ListSets

http://harvest.issuelab.org/provider/oai?verb=ListMetadataFormats

http://harvest.issuelab.org/provider/oai?verb=ListIdentifiers&metadataPrefix=oai_dc

http://harvest.issuelab.org/provider/oai?verb=ListRecords&metadataPrefix=oai_dc

http://harvest.issuelab.org/provider/oai?verb=GetRecord&identifier=record identifier&metadataPrefix=oai_dc
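
If you are scripting your harvest, the basic requests above can be assembled programmatically. Here is a minimal sketch in Python; `build_request` is a hypothetical helper written for this page, and it only builds the URL string without contacting the server. Note that `urlencode` percent-encodes characters such as ":" in identifiers.

```python
from urllib.parse import urlencode

BASE_URL = "http://harvest.issuelab.org/provider/oai"

# Verbs that take metadataPrefix, matching the basic requests listed above.
NEEDS_PREFIX = {"ListIdentifiers", "ListRecords", "GetRecord"}
VERBS = {"Identify", "ListSets", "ListMetadataFormats"} | NEEDS_PREFIX


def build_request(verb, identifier=None, metadata_prefix="oai_dc"):
    """Hypothetical helper: return the request URL for one of the six verbs."""
    if verb not in VERBS:
        # Verbs are case-sensitive, so "identify" is rejected here.
        raise ValueError(f"unknown verb {verb!r}; verbs are case-sensitive")
    params = {"verb": verb}
    if verb == "GetRecord":
        if identifier is None:
            raise ValueError("GetRecord requires a record identifier")
        params["identifier"] = identifier
    if verb in NEEDS_PREFIX:
        params["metadataPrefix"] = metadata_prefix
    return f"{BASE_URL}?{urlencode(params)}"


print(build_request("Identify"))
print(build_request("ListRecords"))
```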


Verbs + Parameters = Better Results!

You can actually do quite a bit with these verbs when mixed with some other tricks. For example, you can ask for all records saved to the IssueLab repository between 2010-01-06 and 2011-01-01 within our "Health and Medicine" issue area like this (you must use date format: YYYY-MM-DDTHH:MM:SSZ):

http://harvest.issuelab.org/provider/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2010-01-06T16:25:14Z&until=2011-01-01T23:59:59Z&set=health_and_medicine

Want records in a particular issue area stored from a particular date until the present? Just leave off the "until" parameter. Example:

http://harvest.issuelab.org/provider/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2010-01-06T16:25:14Z&set=health_and_medicine
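
The date format is easy to get wrong by hand, so it may help to generate it. The sketch below, in Python, formats a datetime as YYYY-MM-DDTHH:MM:SSZ and builds a ListRecords URL with optional "from", "until", and "set" parameters. The helper names `oai_timestamp` and `list_records_url` are assumptions for illustration; naive datetimes are assumed to already be in UTC.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "http://harvest.issuelab.org/provider/oai"


def oai_timestamp(moment):
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ.

    Timezone-aware values are converted to UTC; naive values are
    assumed to be UTC already."""
    if moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def list_records_url(set_name=None, start=None, end=None):
    """Hypothetical helper: build a ListRecords URL, leaving out any
    parameter that is None (for example, omit `end` to harvest up to
    the present)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if start is not None:
        params["from"] = oai_timestamp(start)
    if end is not None:
        params["until"] = oai_timestamp(end)
    if set_name is not None:
        params["set"] = set_name
    # safe=":" keeps the colons in the timestamps readable, as in the URLs above.
    return BASE_URL + "?" + urlencode(params, safe=":")


print(list_records_url("health_and_medicine",
                       start=datetime(2010, 1, 6, 16, 25, 14),
                       end=datetime(2011, 1, 1, 23, 59, 59)))
```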


An Important Note About the "date" Parameter

The "from" and "until" date parameters available to you pertain to the date a record became available in our data provider service and not the date that a research listing was originally published. To search on records by date of publication (e.g., all research listings that were originally released on a certain date), use our website's search tool.


HTTP Request Parameters

Here are the arguments that you can use in your HTTP request (don't forget -- nothing happens unless you issue one of the six verbs!):


Issue Areas or "Sets"

Interested in retrieving data about a particular IssueLab issue area? We provide the following "sets" -- OAI-PMH's way of letting you home in on just the data you want. To use the set parameter, you would use a URL like this one (we'll request all research listed under the "Animal Welfare" issue area):

http://harvest.issuelab.org/provider/oai?verb=ListRecords&metadataPrefix=oai_dc&set=animal_welfare

Here's the complete list of issue areas with their corresponding set names. You will use the set name in your HTTP request:

Issue area | Set | HTTP request (append to http://harvest.issuelab.org/provider/)
Aging | aging | oai?verb=ListRecords&metadataPrefix=oai_dc&set=aging
Agriculture and Food | agriculture_and_food | oai?verb=ListRecords&metadataPrefix=oai_dc&set=agriculture_and_food
Animal Welfare | animal_welfare | oai?verb=ListRecords&metadataPrefix=oai_dc&set=animal_welfare
Arts and Culture | arts_and_culture | oai?verb=ListRecords&metadataPrefix=oai_dc&set=arts_and_culture
Athletics and Sports | athletics_and_sports | oai?verb=ListRecords&metadataPrefix=oai_dc&set=athletics_and_sports
Children and Youth | children_and_youth | oai?verb=ListRecords&metadataPrefix=oai_dc&set=children_and_youth
Civil Society | civil_society | oai?verb=ListRecords&metadataPrefix=oai_dc&set=civil_society
Community and Economic Development | community_and_economic_development | oai?verb=ListRecords&metadataPrefix=oai_dc&set=community_and_economic_development
Computers and Technology | computers_and_technology | oai?verb=ListRecords&metadataPrefix=oai_dc&set=computers_and_technology
Consumer Protection | consumer_protection | oai?verb=ListRecords&metadataPrefix=oai_dc&set=consumer_protection
Crime and Safety | crime_and_safety | oai?verb=ListRecords&metadataPrefix=oai_dc&set=crime_and_safety
Disabilities | disabilities | oai?verb=ListRecords&metadataPrefix=oai_dc&set=disabilities
Education and Literacy | education_and_literacy | oai?verb=ListRecords&metadataPrefix=oai_dc&set=education_and_literacy
Employment and Labor | employment_and_labor | oai?verb=ListRecords&metadataPrefix=oai_dc&set=employment_and_labor
Energy and Environment | energy_and_environment | oai?verb=ListRecords&metadataPrefix=oai_dc&set=energy_and_environment
Gay, Lesbian, Bi and Trans | gay_lesbian_bi_and_trans | oai?verb=ListRecords&metadataPrefix=oai_dc&set=gay_lesbian_bi_and_trans
General | general | oai?verb=ListRecords&metadataPrefix=oai_dc&set=general
Government Reform | government_reform | oai?verb=ListRecords&metadataPrefix=oai_dc&set=government_reform
Health | health | oai?verb=ListRecords&metadataPrefix=oai_dc&set=health
Housing and Homelessness | housing_and_homelessness | oai?verb=ListRecords&metadataPrefix=oai_dc&set=housing_and_homelessness
Humanitarian and Disaster Relief | humanitarian_and_disaster_relief | oai?verb=ListRecords&metadataPrefix=oai_dc&set=humanitarian_and_disaster_relief
Human Rights and Civil Liberties | human_rights_and_civil_liberties | oai?verb=ListRecords&metadataPrefix=oai_dc&set=human_rights_and_civil_liberties
Human Services | human_services | oai?verb=ListRecords&metadataPrefix=oai_dc&set=human_services
Hunger | hunger | oai?verb=ListRecords&metadataPrefix=oai_dc&set=hunger
Immigration | immigration | oai?verb=ListRecords&metadataPrefix=oai_dc&set=immigration
International Development | international_development | oai?verb=ListRecords&metadataPrefix=oai_dc&set=international_development
Journalism and Media | journalism_and_media | oai?verb=ListRecords&metadataPrefix=oai_dc&set=journalism_and_media
Medical Research | medical_research | oai?verb=ListRecords&metadataPrefix=oai_dc&set=medical_research
Men | men | oai?verb=ListRecords&metadataPrefix=oai_dc&set=men
Nonprofits and Philanthropy | nonprofits_and_philanthropy | oai?verb=ListRecords&metadataPrefix=oai_dc&set=nonprofits_and_philanthropy
Parenting and Families | parenting_and_families | oai?verb=ListRecords&metadataPrefix=oai_dc&set=parenting_and_families
Peace and Conflict | peace_and_conflict | oai?verb=ListRecords&metadataPrefix=oai_dc&set=peace_and_conflict
Poverty | poverty | oai?verb=ListRecords&metadataPrefix=oai_dc&set=poverty
Prison and Judicial Reform | prison_and_judicial_reform | oai?verb=ListRecords&metadataPrefix=oai_dc&set=prison_and_judicial_reform
Race and Ethnicity | race_and_ethnicity | oai?verb=ListRecords&metadataPrefix=oai_dc&set=race_and_ethnicity
Religion | religion | oai?verb=ListRecords&metadataPrefix=oai_dc&set=religion
Science | science | oai?verb=ListRecords&metadataPrefix=oai_dc&set=science
Substance Abuse and Recovery | substance_abuse_and_recovery | oai?verb=ListRecords&metadataPrefix=oai_dc&set=substance_abuse_and_recovery
Transportation | transportation | oai?verb=ListRecords&metadataPrefix=oai_dc&set=transportation
Welfare and Public Assistance | welfare_and_public_assistance | oai?verb=ListRecords&metadataPrefix=oai_dc&set=welfare_and_public_assistance
Women | women | oai?verb=ListRecords&metadataPrefix=oai_dc&set=women
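
Every set name in the list above appears to follow one pattern: the issue area in lowercase, punctuation dropped, words joined with underscores. That pattern is an observation from the list, not a documented rule, so treat the following Python sketch (with the hypothetical helper `set_name`) as a convenience and confirm names against a ListSets response.

```python
import re


def set_name(issue_area):
    """Hypothetical helper: derive a set name from an issue area title by
    lowercasing it, dropping punctuation, and joining words with underscores."""
    words = re.findall(r"[a-z0-9]+", issue_area.lower())
    return "_".join(words)


print(set_name("Gay, Lesbian, Bi and Trans"))
print(set_name("Human Rights and Civil Liberties"))
```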

A Note about our <dc:format>scribd....</dc:format> and Scribd

If a record includes "<dc:format>scribd;....</dc:format>", a Scribd.com document is available. Everything you need to embed a thumbnail graphic or entire dynamic document, generated at Scribd.com, appears in this line as follows.

Example: <dc:format>scribd;37067679;key-174p87aga4148zpw329a</dc:format>

In this example, the value after the first semicolon (37067679) is Scribd's document ID number, and the value after the second semicolon (key-174p87aga4148zpw329a) is Scribd's access key.

You will need both the document ID and the access key to generate/embed resources from Scribd. Complete information about using Scribd's API is available on the Scribd.com website.
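
Splitting the value on semicolons is enough to recover both pieces. A minimal Python sketch, assuming the three-part layout shown in the example (`parse_scribd_format` is a hypothetical name, and this does not call Scribd's API):

```python
def parse_scribd_format(value):
    """Hypothetical helper: return (document_id, access_key) for a
    "scribd;<id>;<key>" value, or None for any other dc:format value."""
    parts = value.strip().split(";")
    if len(parts) != 3 or parts[0] != "scribd":
        return None
    _, document_id, access_key = parts
    return document_id, access_key


print(parse_scribd_format("scribd;37067679;key-174p87aga4148zpw329a"))
print(parse_scribd_format("pdf"))
```

Ordinary format values such as "pdf" return None, so the same function can be run over every dc:format element in a record.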


Technology and Thanks

We are using data provider software created and shared by the University of Michigan. Enter verb=Identify and it's UMich's umich_oai_toolkit that returns the results. :) Complete information is here.

This software package is available at no cost. We are happy to do our part in spreading the word about these terrific open source tools. We wrote our own PHP script to transfer our XML files into umich_oai_toolkit's required MySQL blob format. We're happy to share our open source record loader software.


Questions? Comments?

Don't hesitate to get in touch with us should you require assistance. We're happy to help! Send a note to oai at issuelab.org.


Page last updated: 2011-02-08
