Getting Started

IRW datasets are hosted on Redivis, a platform for academic data sharing and analysis. There are a few different ways you can access IRW data—whether you’re just browsing, downloading tables, or working programmatically in R or Python.

NOTE: While you can explore datasets without logging in, a Redivis account is required to download data or use programmatic tools. You can create an account here.

The sections below walk through each access option and include example code to help you get started.

Option 1: Browse and Download from the Redivis Web Interface

If you’re just looking to explore or manually download individual datasets, you can:

This is the simplest way to get started if you don’t need programmatic access.

Option 3: Use Redivis Client Libraries (R or Python)

If you prefer working outside of R or want low-level access to Redivis features, you can use the official Redivis client libraries. These are available for both R and Python.

Example access to IRW with Redivis Client Libraries:

R:
# first install redivis R package: 
devtools::install_github("redivis/redivis-r", ref="main")
library("redivis")

dataset <- redivis::user("datapages")$dataset("item_response_warehouse") # connect to IRW
df <- dataset$table("4thgrade_math_sirt")$to_tibble() # download data

Python:
# first install redivis Python package with `pip install --upgrade redivis`
import redivis

dataset = redivis.user('datapages').dataset('item_response_warehouse') # connect to IRW
df = dataset.table('4thgrade_math_sirt').to_pandas_dataframe() # download data

How to use the Redivis Client Libraries

There are two main ways to use the Redivis client libraries:

  1. Use a Redivis Notebook
  • Redivis notebooks come preloaded with the latest library – no installation or authentication required. Ideal for first-time users or lightweight workflows.
  • We also provide some example workflows with IRW in Redivis notebooks here.
  2. Use in Other Environments (e.g., RStudio, Jupyter, Colab, etc.)
  • Requires installing the appropriate client library (see example code above for installation)
  • You will need to authenticate with your Redivis account. First-time use prompts a browser login via OAuth; for long-running jobs, you can instead authenticate with an API token (see here for more information about how to generate and set up your API token).
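
For the API-token route, one way to catch a missing token before a long-running job starts is a small pre-flight check. The sketch below assumes the token is supplied through the REDIVIS_API_TOKEN environment variable; the helper name redivis_token_configured is hypothetical and is not part of the Redivis library.

```python
import os


def redivis_token_configured(environ=os.environ):
    """Return True if a non-empty REDIVIS_API_TOKEN is present.

    Hypothetical helper for illustration; it only inspects the environment
    and never contacts Redivis.
    """
    return bool(environ.get("REDIVIS_API_TOKEN", "").strip())


if redivis_token_configured():
    print("API token found; non-interactive authentication should be possible.")
else:
    print("No API token set; expect a browser login prompt on first use.")
```

Running this at the top of a batch script makes a missing token visible immediately rather than partway through a job.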

For more detailed setup and usage examples, see the full Redivis R and Python client documentation here:

Analysis of IRW data

We next provide some examples for working with IRW data. The code blocks below import multiple datasets from the IRW and compute some simple metadata (e.g., the number of responses, participants, and items). They should be a useful starting point for conducting your own analyses of the data.

A first analysis

R:
library(irw)
library(dplyr)
library(purrr)


compute_metadata <- function(df) {
  df <- df |> filter(!is.na(resp)) |> mutate(resp = as.numeric(resp))
  tibble(
    n_responses = nrow(df),
    n_categories = n_distinct(df$resp),
    n_participants = n_distinct(df$id),
    n_items = n_distinct(df$item),
    responses_per_participant = n_responses / n_participants,
    responses_per_item = n_responses / n_items,
    density = (sqrt(n_responses) / n_participants) * (sqrt(n_responses) / n_items)
  )
}

dataset_names <- c("4thgrade_math_sirt", "chess_lnirt", "dd_rotation")
tables<-irw::irw_fetch(dataset_names)
summaries_list <- lapply(tables,compute_metadata)
summaries <- bind_rows(summaries_list)
summaries<-cbind(table=dataset_names,summaries)
summaries
table               n_responses  n_categories  n_participants  n_items  responses_per_participant  responses_per_item    density
4thgrade_math_sirt        19920             2             664       30                  30.000000               664.0  1.0000000
chess_lnirt               10240             2             256       40                  40.000000               256.0  1.0000000
dd_rotation                1178             2             121       10                   9.735537               117.8  0.9735537

Python:
import pandas as pd
from math import sqrt
import redivis

dataset_names = ["4thgrade_math_sirt", "chess_lnirt", "dd_rotation"]

def compute_metadata(df):
    # Drop missing responses, then coerce the remaining ones to numeric
    df = (df
          .loc[~df['resp'].isna()]
          .assign(resp=lambda d: pd.to_numeric(d['resp']))
         )
    
    return pd.DataFrame({
        'n_responses': [len(df)],
        'n_categories': [df['resp'].nunique()],
        'n_participants': [df['id'].nunique()],
        'n_items': [df['item'].nunique()],
        'responses_per_participant': [len(df) / df['id'].nunique()],
        'responses_per_item': [len(df) / df['item'].nunique()],
        'density': [(sqrt(len(df)) / df['id'].nunique()) * (sqrt(len(df)) / df['item'].nunique())]
    })

dataset = redivis.user('datapages').dataset('item_response_warehouse')  # connect to IRW

def get_data_summary(dataset_name):
    # Download one table and label its summary row with the table name
    df = dataset.table(dataset_name).to_pandas_dataframe()

    summary = compute_metadata(df)
    summary.insert(0, 'dataset_name', dataset_name)
    return summary

summaries_list = [get_data_summary(name) for name in dataset_names]
summaries = pd.concat(summaries_list, ignore_index=True)
print(summaries)
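
The density column is easiest to read as the share of the participant-by-item grid that was actually observed: the product of the two square-root terms simplifies to n_responses / (n_participants * n_items). For dd_rotation, 1178 / (121 * 10) is about 0.9736, which matches the R output above. The following is a minimal sketch on made-up toy data (the rows and the function name density are illustrative only) that needs no Redivis access:

```python
from math import sqrt

# Toy long-format data as (id, item, resp); None marks a missing response.
rows = [
    ("p1", "i1", 1), ("p1", "i2", 0), ("p1", "i3", 1),
    ("p2", "i1", 0), ("p2", "i2", 1), ("p2", "i3", None),
]


def density(rows):
    """Same formula as compute_metadata, applied to plain tuples."""
    observed = [r for r in rows if r[2] is not None]
    n_responses = len(observed)
    n_participants = len({r[0] for r in observed})
    n_items = len({r[1] for r in observed})
    return (sqrt(n_responses) / n_participants) * (sqrt(n_responses) / n_items)


# 5 observed responses over a 2 x 3 grid, so roughly 5/6
print(density(rows))
```

A density of 1 therefore means every participant answered every item, as in the first two tables.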

Reformatting IRW data for use with other packages

Here is a slightly more complex example that uses irw to fetch a dataset and then computes the InterModel Vigorish (IMV), contrasting predictions from the 2PL with predictions from the 1PL for an example dataset (using cross-validation across 4 folds; see also the documentation in the related imv package). Note the irw_long2resp function, which is helpful for reformatting IRW data from long to wide.

R:
df<-irw::irw_fetch("gilbert_meta_2")  #https://github.com/itemresponsewarehouse/Rpkg
resp<-irw::irw_long2resp(df)
resp$id<-NULL
##1pl/Rasch model
m0<-mirt::mirt(resp,1,'Rasch',verbose=FALSE)
##2pl
ni<-ncol(resp)
s<-paste("F=1-",ni,"
             PRIOR = (1-",ni,", a1, lnorm, 0.0, 1.0)",sep="")
model<-mirt::mirt.model(s)
m1<-mirt::mirt(resp,model,itemtype=rep("2PL",ni),method="EM",technical=list(NCYCLES=10000),verbose=FALSE)
##compute IMV comparing predictions from 1pl and 2pl
set.seed(8675309)
omega<-imv::imv.mirt(m0,m1)
mean(omega)
[1] 0.01276902
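
For readers working in Python, the long-to-wide step can be approximated with a pandas pivot. This is only a sketch of the general idea on made-up toy data, assuming the id, item, and resp columns used earlier; it is not the implementation of irw_long2resp, which may handle cases (such as repeated responses) that this does not.

```python
import pandas as pd

# Toy long-format data: one row per (participant, item) response.
long_df = pd.DataFrame({
    "id":   ["p1", "p1", "p2", "p2", "p3"],
    "item": ["i1", "i2", "i1", "i2", "i1"],
    "resp": [1, 0, 0, 1, 1],
})

# One row per participant, one column per item; unobserved cells become NaN.
# pivot() raises an error if any (id, item) pair appears more than once.
wide = long_df.pivot(index="id", columns="item", values="resp")
print(wide)
```

Here participant p3 never saw item i2, so that cell is missing in the wide response matrix.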
