This chapter corresponds to code in the researcherRAG repository

Statistical Extraction

Systematically extract quantitative data from your literature database. This prompt is essential when preparing a meta-analysis or building comprehensive data synthesis tables.

📋 When to Use This Prompt

  • ✓ Preparing data for meta-analysis
  • ✓ Creating comparison tables across studies
  • ✓ Extracting effect sizes and sample sizes
  • ✓ Identifying measurement tools used

Prompt Template

Copy this prompt and customize the [outcome] variable:

Extract from all RCT studies:
1. Measurement tools used for [outcome] assessment
2. Effect sizes (Cohen's d or similar)
3. Sample sizes (intervention and control groups)
4. Organize in a table format

For missing values, indicate "Not reported".

Example Usage

Extract from all RCT studies:
1. Measurement tools used for speaking proficiency assessment
2. Effect sizes (Cohen's d or similar)
3. Sample sizes (intervention and control groups)
4. Organize in a table format

For missing values, indicate "Not reported".
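If you run the prompt for several outcomes, substituting the placeholder by hand can introduce typos. The following is a minimal sketch, assuming you keep the template as a Python string; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names for illustration and are not part of the researcherRAG code.

```python
# Hypothetical helper: fills the [outcome] placeholder in the prompt template.
PROMPT_TEMPLATE = """Extract from all RCT studies:
1. Measurement tools used for [outcome] assessment
2. Effect sizes (Cohen's d or similar)
3. Sample sizes (intervention and control groups)
4. Organize in a table format

For missing values, indicate "Not reported"."""


def build_prompt(outcome: str) -> str:
    """Return the extraction prompt with [outcome] replaced by the given text."""
    outcome = outcome.strip()
    if not outcome:
        raise ValueError("outcome must be a non-empty string")
    return PROMPT_TEMPLATE.replace("[outcome]", outcome)


if __name__ == "__main__":
    # Reproduces the "Example Usage" prompt shown above.
    print(build_prompt("speaking proficiency"))
```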

Expected Output Structure

## Statistical Extraction: Speaking Proficiency RCT Studies

| Study | Year | N (Int) | N (Ctrl) | Measurement Tool | Effect Size (d) | 95% CI | p-value | Notes |
|-------|------|---------|----------|------------------|-----------------|--------|---------|-------|
| Smith et al. | 2023 | 60 | 60 | TOEFL Speaking | 0.87 | [0.52, 1.22] | <.001 | 12-week intervention |
| Lee & Kim | 2022 | 45 | 43 | IELTS Speaking | 0.91 | [0.48, 1.34] | <.001 | University students |
| Martinez | 2024 | 75 | 75 | OPI (ACTFL) | 0.64 | [0.31, 0.97] | .002 | High school |
| Johnson | 2021 | 55 | 57 | Custom rubric | 0.12 | [-0.25, 0.49] | .23 | Not significant |
| Park et al. | 2023 | 40 | 38 | TOEFL Speaking | 0.73 | [0.28, 1.18] | .004 | Beginners only |
| Chen | 2022 | 33 | 31 | IELTS Speaking | Not reported | - | <.05 | Only reported p-value |
| Wilson | 2020 | 28 | 26 | Not reported | Not reported | - | .12 | Insufficient data |

### Summary Statistics
- **Total Studies**: 7 RCT studies identified
- **Total Participants**:
  - Intervention: 336 students
  - Control: 330 students
- **Effect Size Range**: d = 0.12 to 0.91
- **Mean Effect Size**: d = 0.65 (unweighted mean of the 5 studies reporting d; moderate to large)
- **Significant Results**: 5/7 studies (71%)

### Measurement Tools Used
1. **TOEFL Speaking Test** (2 studies)
2. **IELTS Speaking Test** (2 studies)
3. **OPI (ACTFL)** (1 study)
4. **Custom rubric** (1 study)
5. **Not reported** (1 study)

### Missing Data Issues
- 2 studies did not report effect sizes (Chen, Wilson)
- 1 study did not report measurement tool (Wilson)
- 2 studies did not report 95% confidence intervals (Chen, Wilson)
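Because the summary figures are generated by the AI, it is worth recomputing them from the table rows before relying on them. The sketch below parses the example table and recalculates the totals, the unweighted mean effect size, and the studies lacking a confidence interval. The function name `parse_markdown_table` is hypothetical, and the parser assumes a simple pipe-delimited table with no escaped `|` characters inside cells.

```python
TABLE = """\
| Study | Year | N (Int) | N (Ctrl) | Measurement Tool | Effect Size (d) | 95% CI | p-value | Notes |
|-------|------|---------|----------|------------------|-----------------|--------|---------|-------|
| Smith et al. | 2023 | 60 | 60 | TOEFL Speaking | 0.87 | [0.52, 1.22] | <.001 | 12-week intervention |
| Lee & Kim | 2022 | 45 | 43 | IELTS Speaking | 0.91 | [0.48, 1.34] | <.001 | University students |
| Martinez | 2024 | 75 | 75 | OPI (ACTFL) | 0.64 | [0.31, 0.97] | .002 | High school |
| Johnson | 2021 | 55 | 57 | Custom rubric | 0.12 | [-0.25, 0.49] | .23 | Not significant |
| Park et al. | 2023 | 40 | 38 | TOEFL Speaking | 0.73 | [0.28, 1.18] | .004 | Beginners only |
| Chen | 2022 | 33 | 31 | IELTS Speaking | Not reported | - | <.05 | Only reported p-value |
| Wilson | 2020 | 28 | 26 | Not reported | Not reported | - | .12 | Insufficient data |
"""


def parse_markdown_table(text):
    """Parse a pipe-delimited Markdown table into a list of dicts keyed by header."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the |---| separator row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


rows = parse_markdown_table(TABLE)

total_int = sum(int(r["N (Int)"]) for r in rows)
total_ctrl = sum(int(r["N (Ctrl)"]) for r in rows)

# Only studies that actually report d contribute to the mean.
effects = [float(r["Effect Size (d)"]) for r in rows
           if r["Effect Size (d)"] != "Not reported"]
mean_d = sum(effects) / len(effects)

missing_ci = [r["Study"] for r in rows if r["95% CI"] == "-"]

print(f"Intervention N: {total_int}, Control N: {total_ctrl}")
print(f"Unweighted mean d over {len(effects)} studies: {mean_d:.2f}")
print(f"Studies without a 95% CI: {missing_ci}")
```

Any mismatch between these recomputed values and the AI-written summary is a signal to re-check the extraction.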

Customization Options

Extract All Study Designs (Not Just RCT)

Extract from ALL quantitative studies (RCT, quasi-experimental, correlational):
1. Study design (RCT, quasi, correlational)
2. Measurement tools for [outcome]
3. Effect sizes or correlation coefficients
4. Sample sizes
5. Organize in a table

Separate tables by study design.

Multi-Outcome Extraction

Extract from RCT studies, for EACH outcome:
- Speaking proficiency
- Motivation
- Anxiety

Create separate tables for each outcome with:
1. Measurement tool
2. Effect size
3. Sample size
4. p-value

Detailed Statistical Information

Extract comprehensive statistical data from RCT studies:
1. Descriptive statistics (M, SD) for pre/post intervention and control
2. Effect sizes (Cohen's d, Hedges' g, partial η²)
3. Test statistics (t, F, χ²)
4. Sample characteristics (age range, gender %, attrition)
5. Power analysis (if reported)

Format as a detailed table with all available values.

Reliability and Validity Data

For each measurement tool used:
1. Tool name and citation
2. Reliability coefficients (Cronbach's α, test-retest)
3. Validity evidence mentioned
4. Number of items/subscales
5. Score range

Create a "Measurement Properties" table.

Common Follow-up Questions

  • Q: "Which studies used the same measurement tool? (for direct comparison)"
  • Q: "What's the weighted average effect size?"
  • Q: "Show me studies with sample sizes > 50 per group"
  • Q: "Which studies reported insufficient statistical details?"
  • Q: "Export this table to CSV format"

Pro Tips

📋 Export to Excel

Copy the Markdown table and paste it into Excel for further analysis, or ask the AI to reformat it as CSV.
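If you prefer to convert the table yourself rather than asking the AI, a few lines of Python are enough. This is a minimal sketch using only the standard library; `markdown_table_to_csv` is a hypothetical helper name, and it assumes a plain pipe-delimited table whose second line is the `|---|` separator.

```python
import csv
import io


def markdown_table_to_csv(markdown: str) -> str:
    """Convert a simple pipe-delimited Markdown table to CSV text."""
    lines = [ln.strip() for ln in markdown.strip().splitlines() if ln.strip()]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for index, ln in enumerate(lines):
        if index == 1:  # the |---|---| separator row carries no data
            continue
        writer.writerow([cell.strip() for cell in ln.strip("|").split("|")])
    return buffer.getvalue()


sample = """\
| Study | Year | N (Int) | 95% CI |
|-------|------|---------|--------|
| Smith et al. | 2023 | 60 | [0.52, 1.22] |
"""

csv_text = markdown_table_to_csv(sample)
print(csv_text)
```

The `csv` module quotes cells that contain commas, such as the confidence interval, so the columns stay aligned when the file is opened in Excel. To save the result, write `csv_text` to a file of your choice (for example a hypothetical `extraction.csv`).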

๐Ÿ” Check Missing Data

Always include "For missing values, indicate 'Not reported'" to identify studies with incomplete data.

📊 Verify Effect Sizes

Spot-check a few effect sizes against the original papers. The AI may misreport or miscalculate a value when the source data are ambiguous.
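When a paper reports group means and standard deviations, you can recompute the effect size yourself and compare it with the extracted value. The sketch below uses the standard pooled-standard-deviation formula for Cohen's d and the usual small-sample correction for Hedges' g. The input numbers are made up for illustration and do not come from any study in the table.

```python
import math


def cohens_d(m_int, sd_int, n_int, m_ctrl, sd_ctrl, n_ctrl):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = (
        (n_int - 1) * sd_int ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    ) / (n_int + n_ctrl - 2)
    return (m_int - m_ctrl) / math.sqrt(pooled_var)


def hedges_g(d, n_int, n_ctrl):
    """Apply the common approximate small-sample correction J = 1 - 3 / (4*df - 1)."""
    df = n_int + n_ctrl - 2
    return d * (1 - 3 / (4 * df - 1))


# Illustrative (hypothetical) post-test scores: intervention M=24, SD=5, n=60;
# control M=20, SD=5, n=60.
d = cohens_d(24, 5, 60, 20, 5, 60)
g = hedges_g(d, 60, 60)
print(f"Cohen's d = {d:.2f}, Hedges' g = {g:.3f}")
```

If the recomputed value differs noticeably from the extracted one, check whether the paper used a different standardizer (for example the control-group SD only) or reported adjusted means.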

โš–๏ธ Note Heterogeneity

If effect sizes vary widely, ask follow-up questions about moderators (age, context, duration, etc.)

Meta-Analysis Workflow

Use this prompt as part of a systematic meta-analysis pipeline:

  1. Use this prompt to extract effect sizes and sample sizes
  2. Export the table to Excel, R, or Stata
  3. Calculate the weighted average effect size
  4. Test for heterogeneity (I², Q statistic)
  5. Conduct a moderator analysis if the effects are heterogeneous
  6. Check for publication bias (funnel plot, Egger's test)
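The weighted-average and heterogeneity steps can be sketched in plain Python for a quick sanity check. This example uses a fixed-effect, inverse-variance model and a common large-sample approximation for the variance of d, applied to the five studies in the example table that report an effect size. It is an illustration under those assumptions, not a replacement for a dedicated meta-analysis package, and a random-effects model is often more appropriate when heterogeneity is present.

```python
import math

# (study, d, n_intervention, n_control) for the studies that report d.
STUDIES = [
    ("Smith et al.", 0.87, 60, 60),
    ("Lee & Kim", 0.91, 45, 43),
    ("Martinez", 0.64, 75, 75),
    ("Johnson", 0.12, 55, 57),
    ("Park et al.", 0.73, 40, 38),
]


def d_variance(d, n1, n2):
    """Large-sample approximation of the sampling variance of Cohen's d."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def fixed_effect_summary(studies):
    """Inverse-variance pooled effect with Cochran's Q and the I² statistic."""
    effects = [d for _, d, _, _ in studies]
    weights = [1 / d_variance(d, n1, n2) for _, d, n1, n2 in studies]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(studies) - 1
    # I² is the share of variability beyond chance, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled_d": pooled, "se": se, "Q": q, "df": df, "I2": i2}


summary = fixed_effect_summary(STUDIES)
low = summary["pooled_d"] - 1.96 * summary["se"]
high = summary["pooled_d"] + 1.96 * summary["se"]
print(f"Pooled d = {summary['pooled_d']:.2f} (95% CI {low:.2f} to {high:.2f})")
print(f"Q = {summary['Q']:.2f} on {summary['df']} df, I² = {summary['I2']:.1f}%")
```

Studies that omit an effect size (Chen and Wilson in the example) cannot enter this calculation until d is recovered from other reported statistics or obtained from the authors.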