Getting Started

IRW datasets are hosted on Redivis, a platform for academic data sharing and analysis. You can access IRW data in several ways, whether you’re just browsing, downloading tables, or working programmatically in R or Python.

NOTE: While you can explore datasets without logging in, a Redivis account is required to download data or use programmatic tools. You can create an account here.

The sections below walk through each access option and include example code to help you get started.

Option 1: Browse and Download from the Redivis Web Interface

If you’re just looking to explore or manually download individual datasets, you can:

View datasets through the IRW Data Browser
Download tables as CSV files from the Redivis web interface (Redivis login required).

This is the simplest way to get started if you don’t need programmatic access.
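Once a table has been downloaded as a CSV, it can be loaded with any standard tool. The sketch below uses pandas and assumes the file has the long-format columns `id`, `item`, and `resp` that appear in the examples later on this page. The helper name `load_irw_csv` is hypothetical, and an in-memory CSV with synthetic values stands in for a real download.

```python
import io
import pandas as pd

REQUIRED_COLUMNS = ("id", "item", "resp")  # assumed from the examples on this page

def load_irw_csv(source):
    """Read a downloaded IRW table and check that the expected columns exist."""
    df = pd.read_csv(source)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return df

# Synthetic stand-in for a downloaded file (not real IRW data)
example_csv = io.StringIO("id,item,resp\n1,q1,1\n1,q2,0\n2,q1,1\n")
df = load_irw_csv(example_csv)
print(df.shape)
```

For a real download, pass the file path instead of the in-memory buffer, for example `load_irw_csv("4thgrade_math_sirt.csv")` (a hypothetical file name).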

Option 2: Use the `irw` R Package (Recommended)

We recommend the irw R package if you work in R and want a simple, streamlined way to access IRW data. For complete setup instructions and a reference guide to available functions, visit the package website here.

On first use, you’ll be prompted to log in with your Redivis account and grant access via OAuth. This authentication step is required once per R session. For details, see the “Redivis Authentication” section on the package website.

Setup & Example Usage

Code

# Install and load the package
devtools::install_github("itemresponsewarehouse/Rpkg")
library(irw)

irw_info()                  # Overview of the IRW
irw_list_tables()          # List available tables
irw_filter(var = "rt")     # Search for tables with a specific variable
df <- irw_fetch("4thgrade_math_sirt")

Option 3: Use Redivis Client Libraries (R or Python)

If you prefer working outside of R or want low-level access to Redivis features, you can use the official Redivis client libraries. These are available for both R and Python.

Example access to IRW with Redivis Client Libraries:

R (redivis-r)

Code

# First install the redivis R package:
devtools::install_github("redivis/redivis-r", ref="main")
library("redivis")

dataset <- redivis::user("datapages")$dataset("item_response_warehouse") # connect to IRW
df <- dataset$table("4thgrade_math_sirt")$to_tibble() # download data

Python (redivis-python)

Code

# first install redivis Python package with `pip install --upgrade redivis`
import redivis

dataset = redivis.user('datapages').dataset('item_response_warehouse') # connect to IRW
df = dataset.table('4thgrade_math_sirt').to_pandas_dataframe() # download data

How to use the Redivis Client Libraries

There are two main ways to use the Redivis client libraries:

Use a Redivis Notebook

Redivis notebooks come preloaded with the latest library – no installation or authentication required. Ideal for first-time users or lightweight workflows.
We also provide some example workflows with IRW in Redivis notebooks here.

Use in Other Environments (e.g., RStudio, Jupyter, Colab, etc.)

Requires installing the appropriate client library (see the example code above for installation).
You will need to authenticate with your Redivis account. On first use, a browser window opens for OAuth login. For long-running jobs, you can authenticate with an API token instead (see here for more information about how to generate and set up your API token).
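For the token-based route, it can help to check that a token is actually configured before starting a long-running job. The sketch below is a minimal pre-flight check, assuming the client reads the token from an environment variable named `REDIVIS_API_TOKEN`; confirm the variable name against the Redivis documentation. The helper `token_is_configured` is hypothetical and is not part of the Redivis library.

```python
import os

TOKEN_VARIABLE = "REDIVIS_API_TOKEN"  # assumed variable name; confirm in the Redivis docs

def token_is_configured(environ=None):
    """Return True if a non-empty API token is present in the environment."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(TOKEN_VARIABLE, "").strip())

if token_is_configured():
    print("API token found; the client should not need a browser login.")
else:
    print("No API token set; expect an OAuth browser prompt on first use.")
```

Keeping the token in the environment rather than in the script avoids committing it to version control by accident.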

For more detailed setup and usage examples, see the full Redivis R and Python client documentation.

Analysis of IRW data

The examples below show how to work with IRW data. The code blocks (first R, then Python) import several tables from the IRW and compute simple summary statistics, such as the number of responses. They should be a useful starting point for your own analyses of the data.

Code

library(irw)
library(dplyr)
library(purrr)


compute_metadata <- function(df) {
  df <- df |> filter(!is.na(resp)) |> mutate(resp = as.numeric(resp))
  tibble(
    n_responses = nrow(df),
    n_categories = n_distinct(df$resp),
    n_participants = n_distinct(df$id),
    n_items = n_distinct(df$item),
    responses_per_participant = n_responses / n_participants,
    responses_per_item = n_responses / n_items,
    density = (sqrt(n_responses) / n_participants) * (sqrt(n_responses) / n_items)
  )
}

dataset_names <- c("4thgrade_math_sirt", "chess_lnirt", "dd_rotation")
tables <- irw::irw_fetch(dataset_names)               # fetch the tables
summaries_list <- lapply(tables, compute_metadata)    # one summary row per table
summaries <- bind_rows(summaries_list)
summaries <- cbind(table = dataset_names, summaries)  # label each row with its table name
summaries

table	n_responses	n_categories	n_participants	n_items	responses_per_participant	responses_per_item	density
4thgrade_math_sirt	19920	2	664	30	30.000000	664.0	1.0000000
chess_lnirt	10240	2	256	40	40.000000	256.0	1.0000000
dd_rotation	1178	2	121	10	9.735537	117.8	0.9735537

Code

import pandas as pd
from math import sqrt
import redivis

dataset_names = ["4thgrade_math_sirt", "chess_lnirt", "dd_rotation"]

def compute_metadata(df):
    # Drop missing responses, then coerce the remaining ones to numeric
    df = (df
          .loc[~df['resp'].isna()]
          .assign(resp=lambda d: pd.to_numeric(d['resp']))
         )

    n_responses = len(df)
    n_participants = df['id'].nunique()
    n_items = df['item'].nunique()

    return pd.DataFrame({
        'n_responses': [n_responses],
        'n_categories': [df['resp'].nunique()],
        'n_participants': [n_participants],
        'n_items': [n_items],
        'responses_per_participant': [n_responses / n_participants],
        'responses_per_item': [n_responses / n_items],
        'density': [(sqrt(n_responses) / n_participants) * (sqrt(n_responses) / n_items)]
    })

# Connect to the IRW dataset on Redivis
dataset = redivis.user('datapages').dataset('item_response_warehouse')

def get_data_summary(dataset_name):
    # Download one table and summarize it
    df = dataset.table(dataset_name).to_pandas_dataframe()
    summary = compute_metadata(df)
    summary.insert(0, 'dataset_name', dataset_name)
    return summary

summaries_list = [get_data_summary(name) for name in dataset_names]
summaries = pd.concat(summaries_list, ignore_index=True)
print(summaries)
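The `density` expression used in both versions above simplifies algebraically to n_responses / (n_participants × n_items), that is, the share of cells in the full participant-by-item matrix that hold a response. The sketch below checks this on a small synthetic table (not IRW data); the function name `density` is hypothetical, and the column names `id`, `item`, and `resp` are taken from the code above.

```python
from math import sqrt
import pandas as pd

def density(df):
    """Fraction of participant-by-item cells that contain a non-missing response."""
    observed = df.dropna(subset=["resp"])
    return len(observed) / (observed["id"].nunique() * observed["item"].nunique())

# Synthetic long-format data: 3 participants, 2 items, one response missing
toy = pd.DataFrame({
    "id":   [1, 1, 2, 2, 3, 3],
    "item": ["a", "b", "a", "b", "a", "b"],
    "resp": [1, 0, 1, None, 0, 1],
})

n = toy["resp"].notna().sum()                  # 5 observed responses
original_form = (sqrt(n) / 3) * (sqrt(n) / 2)  # expression used in the code above
print(density(toy), original_form)
```

Both values equal 5/6 here. Applied to the dd_rotation row of the output above, 1178 / (121 × 10) ≈ 0.9736, which matches the reported density.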

Reformatting IRW data for use with other packages

Here is a slightly more complex example that uses irw to fetch a dataset and then computes the InterModel Vigorish (IMV), contrasting predictions from the 2PL with predictions from the 1PL (using cross-validation across 4 folds; see also the documentation in the related imv package). Note the irw_long2resp function, which reformats IRW data from long to wide.

Code

# Requires the irw, mirt, and imv packages
df <- irw::irw_fetch("gilbert_meta_2")  # https://github.com/itemresponsewarehouse/Rpkg
resp <- irw::irw_long2resp(df)          # reshape from long to wide
resp$id <- NULL                         # keep only the item response columns
## 1PL/Rasch model
m0 <- mirt::mirt(resp, 1, 'Rasch', verbose = FALSE)
## 2PL with a lognormal(0, 1) prior on the slopes (a1)
ni <- ncol(resp)
s<-paste("F=1-",ni,"
             PRIOR = (1-",ni,", a1, lnorm, 0.0, 1.0)",sep="")
model <- mirt::mirt.model(s)
m1 <- mirt::mirt(resp, model, itemtype = rep("2PL", ni), method = "EM", technical = list(NCYCLES = 10000), verbose = FALSE)
## Compute the IMV comparing predictions from the 1PL and 2PL
set.seed(8675309)
omega <- imv::imv.mirt(m0, m1)
mean(omega)

[1] 0.01276902
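For readers working in Python, the long-to-wide step can be approximated with a pandas pivot. This is only a rough analogue of `irw_long2resp`, not its implementation: it assumes the long-format columns `id`, `item`, and `resp`, at most one response per participant-item pair, and synthetic data. The function name `long_to_wide` is hypothetical.

```python
import pandas as pd

def long_to_wide(df):
    """Pivot long-format responses to one row per participant, one column per item."""
    wide = df.pivot(index="id", columns="item", values="resp")
    wide.columns.name = None  # drop the "item" label on the column axis
    return wide.reset_index()

# Synthetic long-format data (not IRW data); participant 2 skipped item "b"
long_df = pd.DataFrame({
    "id":   [1, 1, 2, 3, 3],
    "item": ["a", "b", "a", "a", "b"],
    "resp": [1, 0, 1, 0, 1],
})

wide_df = long_to_wide(long_df)
print(wide_df)
```

Unanswered items appear as missing values in the wide table. `pivot` raises an error if a participant answered the same item more than once; `pivot_table` with an explicit aggregation would be one way to handle that case.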
