DOI-pinned · checksum-verified · pooch-style on-demand

A reproducible bank of raw NIRS reference datasets

Curated near-infrared spectroscopy datasets — provenance-rich, license-aware, and citable. Each is described by a full identity card; the bytes stay at their licensed origin and are fetched on demand, verified, and cached.

Browse the catalog →
The bank at a glance

Every dataset, measured

164 datasets
310,888 total samples
171 spectral sources
21 domains
4 multi-source
42 public-tier
Whole-bank dataviz

What's inside the bank

Computed once at build time and rendered as inline SVG — no trackers, no runtime chart library.

Wavelength coverage
Spectral range each family covers (axis unit not specified; 74 datasets on another axis unit are not shown)
other: 0.3–25.04 · 58 datasets
NIR: 0.4–4,000 · 24 datasets
MIR: 102–11,994 · 5 datasets
Raman: 51.43–11,994 · 3 datasets
Band = min–max coverage, then range · dataset count
Samples × features
Every dataset positioned by sample count and wavelength/feature count
[Scatter plot: samples versus features, 164 datasets. Axes: samples per dataset (log scale, 1 to 100,000) and features / wavelengths (log scale, 100 to 10,000). Colour = profile risk (low, mid, high); marker size = number of targets.]
Property profiles
Highest anomaly/heterogeneity profiles in the bank (0 center, 1 outer ring)
Axes: integrity · noise · artefacts · baseline · PCA outliers · reference · repeatability · structure
Profiled datasets: ECOSTRESS meteorites tir axis d3032c60 · ECOSTRESS mineral tir axis 02866850 · OpenSpecy FTIR spectral library subset · OpenSpecy RAMAN spectral library subset · FLOPP-e FTIR polymer classification · ossl kellogg visnir soil all y
Readout shown: integrity 1.00 · noise 0.00 · artefacts 1.00 · baseline 1.00 · PCA outliers 1.00 · reference 1.00 · repeatability 0.00 · structure 1.00
Scale: 0 center · 1 outer ring (rings at 0.25, 0.5, 0.75, 1) · outward = stronger anomaly / heterogeneity signal
Datasets by domain
Application domains across the bank
ecosis: 58 · ecostress: 58 · ossl: 16 · timeseries: 9 · ohpl: 3 · malaria: 2 · openspecy: 2 · plastic: 2 · rruff: 2 · cartilage: 1 · cgl: 1 · chembl: 1 · corn: 1 · diesel: 1 · +7 more: 7
Spectroscopy family
Single-modality family per dataset (mixed when heterogeneous)
NIR: 97 (59%) · other: 58 (35%) · MIR: 5 (3%) · Raman: 3 (2%) · mixed: 1 (1%) · total: 164
Access tier
Governance tier: public / private / anonymized
private: 122 (74%) · public: 42 (26%) · total: 164
Targets per dataset
How many prediction targets each dataset declares (0 = metadata-only)
Distribution of target counts (targets per dataset: number of datasets)
1–5: 103 · 5–10: 13 · 10–14: 14 · 14–18: 2 · 18–23: 2 · 23–27: 2 · 27–31: 5 · 31–36: 3 · 36–40: 0 · 40–44: 0 · 44–49: 1 · 49–53: 19
Sample-size distribution
#samples per dataset (log-spaced bins) · median 151
Distribution of dataset sample counts (samples per dataset: number of datasets)
2–4: 9 · 4–8: 15 · 8–15: 8 · 15–29: 8 · 29–56: 14 · 56–109: 18 · 109–213: 23 · 213–414: 15 · 414–806: 18 · 806–1,570: 11 · 1,570–3,058: 14 · 3,058–5,955: 6 · 5,955–11,597: 1 · 11,597–22,587: 0 · 22,587–43,988: 2 · 43,988–85,669: 2
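The log-spaced bins used for the sample-size histogram can be reproduced with a few lines of standard-library Python. This is a minimal sketch, not the site's build code: the function names `log_spaced_edges` and `histogram` are hypothetical, and the range (2 to 85,669) and bin count (16) are simply read off the chart labels.

```python
import bisect
import math


def log_spaced_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 edges spaced evenly in log10 between lo and hi."""
    if lo <= 0 or hi <= lo or n_bins < 1:
        raise ValueError("need 0 < lo < hi and n_bins >= 1")
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    step = (log_hi - log_lo) / n_bins
    return [10 ** (log_lo + i * step) for i in range(n_bins + 1)]


def histogram(values: list[float], edges: list[float]) -> list[int]:
    """Count values per bin; out-of-range values are clamped to the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = bisect.bisect_right(edges, v) - 1
        idx = min(max(idx, 0), len(counts) - 1)  # keeps v == hi in the last bin
        counts[idx] += 1
    return counts


# Assumed inputs: smallest and largest dataset sizes and 16 bins, as on the chart.
edges = log_spaced_edges(2, 85_669, 16)
print([round(e) for e in edges])
```

Under these assumptions the rounded edges land close to the bin boundaries listed for the chart (for example 56, 109 and 806), which suggests the bins are evenly spaced in log10 of the sample count.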
Wavelength-count distribution
#wavelengths per dataset · median 2,101
Distribution of wavelength counts (wavelengths per dataset: number of datasets)
117–145: 2 · 145–179: 0 · 179–222: 0 · 222–274: 4 · 274–339: 1 · 339–420: 3 · 420–519: 4 · 519–642: 7 · 642–795: 5 · 795–983: 3 · 983–1,217: 15 · 1,217–1,505: 2 · 1,505–1,862: 13 · 1,862–2,304: 80 · 2,304–2,851: 14 · 2,851–3,527: 11
License mix
SPDX license per dataset across the bank
LicenseRef-not-cleared: 122 · CC-BY-4.0: 18 · CC-BY-SA-4.0: 11 · ODC-By-1.0: 7 · ODbL-1.0: 5 · PDDL-1.0: 1
Origin kinds
Where the bytes originate (zenodo / dataverse / url / …)
script: 164 (49%) · url: 159 (48%) · figshare: 6 (2%) · zenodo: 3 (1%) · dataverse: 2 (1%) · total: 334
Use it in one line

One call, fully reproducible

Datasets download on demand by their pinned DOI, are checksum-verified, and cached locally.

# pip install nirs4all-datasets
from nirs4all_datasets import get

ds = get("cartilage_spectroscopy_scientificdata_nir")            # DOI-pinned, checksum-verified, cached
X, y = ds.x(), ds.y()
print(X.shape, y.shape)

Public datasets fetch openly; private/anonymized datasets need a Dataverse token. The bytes always live at their licensed origin — this catalog never redistributes them.
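The fetch, verify, cache cycle described here can be illustrated with the standard library alone. This is a hedged sketch of the general pooch-style pattern, not the nirs4all-datasets implementation: `fetch_verified`, `sha256_of`, their parameters and the content-addressed cache layout are all hypothetical, and token-based access for private datasets is left out.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large downloads do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_verified(url: str, expected_sha256: str, cache_dir: Path) -> Path:
    """Download url once, verify its SHA-256, and reuse the cached copy afterwards."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / expected_sha256  # hypothetical layout: file named by its hash
    if target.exists() and sha256_of(target) == expected_sha256:
        return target  # cache hit: no network access needed

    partial = target.with_suffix(".part")
    with urllib.request.urlopen(url) as response, partial.open("wb") as out:
        shutil.copyfileobj(response, out)

    actual = sha256_of(partial)
    if actual != expected_sha256:
        partial.unlink()  # never leave unverified bytes in the cache
        raise ValueError(f"checksum mismatch: expected {expected_sha256}, got {actual}")

    partial.replace(target)  # publish only after verification succeeds
    return target
```

Writing to a `.part` file and renaming it only after the hash matches means an interrupted or corrupted download is never mistaken for a valid cache entry.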

Provenance & citation

Built to be cited

Each dataset carries its DataCite provenance, origin sources, and publication DOIs. Cite the dataset DOI and its origin publications; respect each dataset's own license.

Open the full catalog →