A two-step path from catalog feasibility to governed custom oncology data delivery.
Start with catalog-level feasibility, then move to a Joint Research Agreement once the indication and scope are clear.
Check patient counts, hospitals, and biomarker availability. No Joint Research Agreement is required at this stage.
A four-party agreement among the institution, Sumitomo, the RWD partner, and the pharma client. It enables custom clinical and molecular data collection.
How hospital registry and EMR data become aggregate reports and de-identified client datasets.
Cancer registry combined with EMR records.
Build the hospital database, query target patients, and supplement missing fields.
Aggregate reports and research-ready raw data prepared.
Reports and curated data delivered through a controlled server.
Client receives agreed deliverables for analysis and regulatory use.
Pseudonymized raw patient-level data requires IRB protocol review at each participating institution before release.
EMR IDs support in-hospital review before de-identification or pseudonymization.
| Step | Action |
|---|---|
| 01 | Import hospital-based cancer registry data. |
| 02 | Automatically import test and medication data for registry patients. |
| 03 | Import biomarker data for those patients using AI. |
| 04 | Build an in-hospital database with EMR IDs linked. |
| 05 | Query the in-hospital database to extract target patients. |
| 06 | Check the EMR for missing data using the EMR IDs of target patients. |
| 07 | Add EMR information to resolve missing fields. |
| 08 | Query the Sumitomo DB and send aggregate reports to the central server. |
| 09 | Send raw data after replacing EMR ID with case ID where required. |
| 10 | Download aggregate reports and raw data from the central server. |
| 11 | Deliver aggregate reports and raw data to the client. |
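Step 09 can be illustrated with a minimal sketch. It assumes records are plain dictionaries with a hypothetical `emr_id` field, and that case IDs are random tokens whose mapping stays inside the hospital. The function name, field names, and ID format are illustrative only and do not describe the actual system.

```python
import secrets


def assign_case_ids(records, id_field="emr_id"):
    """Replace each record's EMR ID with a random case ID (illustrative only).

    Returns the pseudonymized records plus the EMR-ID-to-case-ID mapping.
    In this sketch the mapping is assumed to remain inside the hospital.
    """
    mapping = {}
    released = []
    for record in records:
        emr_id = record[id_field]
        # Reuse the same case ID for every record of the same patient.
        if emr_id not in mapping:
            mapping[emr_id] = "CASE-" + secrets.token_hex(8)
        # Copy all fields except the EMR ID, then attach the case ID.
        row = {key: value for key, value in record.items() if key != id_field}
        row["case_id"] = mapping[emr_id]
        released.append(row)
    return released, mapping
```

Only `released` would leave the hospital in this sketch. Keeping `mapping` local is what allows in-hospital review by EMR ID while the outgoing rows carry case IDs only.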
Step 1 answers feasibility quickly. Step 2 unlocks deeper clinical and molecular curation.
Limited fields, sufficient to confirm fit before contracting.
Delivered under a four-party agreement among the hospital, Sumitomo, the RWD partner, and the pharma client.
Step 1 uses limited reference fields. Step 2 adds deeper custom collection after a JRA.
Catalog fields from cancer registry data plus EMR-extracted medications, labs, and biomarkers.
| Domain / file | Representative fields / examples | Source / route |
|---|---|---|
| Patient foundation | Patient ID, Episode ID, patient name, gender, date of birth | Medical record number, duplicate number, identity, gender, date of birth |
| Cancer type | Date of diagnosis, cancer type classification, site, subsite, laterality | Date of diagnosis; primary site localization code; primary root site; laterality |
| Cancer type | Differentiation, histology, pathological diagnosis name | Pathological diagnosis morphology code and histology text |
| Stage | UICC version, cT/cN/cM/cStage, pT/pN/pM/pStage | UICC Ver. 8 and TNM classification fields |
| Treatment methods | Treatment start date, treatment methods, lines of treatment | Surgery, endoscopic treatment, radiotherapy, chemotherapy, endocrine therapy; treatment line is recorded as first-line only when chemotherapy is included |
| Outcomes | Outcomes, last survival confirmation date, date of death | Survival status, last confirmed survival date, death date |
| Direct EMR extraction | Medication, laboratory test, biomarker test result data | Daily medication and lab updates; biomarker test result data |
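A catalog-level feasibility query over fields like these might look like the sketch below. It assumes each patient is a dictionary with hypothetical `hospital`, `cancer_type`, and `biomarkers` keys. These names are placeholders rather than the real catalog schema, and the function returns aggregate counts only, never patient rows.

```python
from collections import defaultdict


def feasibility_summary(patients, cancer_type, biomarker):
    """Count matching patients per hospital and how many have a biomarker result.

    All field names are hypothetical; only aggregate counts are returned.
    """
    per_hospital = defaultdict(lambda: {"patients": 0, "with_biomarker": 0})
    for patient in patients:
        if patient["cancer_type"] != cancer_type:
            continue
        counts = per_hospital[patient["hospital"]]
        counts["patients"] += 1
        # A biomarker counts as available if any result is on record.
        if biomarker in patient.get("biomarkers", {}):
            counts["with_biomarker"] += 1
    return dict(per_hospital)
```

Called with an indication and a biomarker name, this yields one entry per hospital, which covers the three feasibility outputs: patient counts, hospital coverage, and biomarker availability.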
JRA-stage files collected and curated for the agreed research question.
| Domain / file | Representative fields / examples | Source / route |
|---|---|---|
| cow_consent.csv | Case management number, consent type, consent status, consent date, representative | Consent and withdrawal records |
| cow_patient.csv | Case management number, gender, date of birth, age | Patient demographics |
| cow_patient_background.csv | Family history, ECOG PS, ECOG PS evaluation date, smoking status, smoking years, cigarettes per day, Brinkman Index | Manual curation and EMR review |
| cow_comorbidity.csv / cow_medical_history.csv | Disease name, age at onset, other disease name | Comorbidity and medical history review |
| cow_cancer.csv | Date of diagnosis, pathology report date, cancer type classification, OncoTree code and version | Cancer diagnosis and pathology records |
| cow_therapeutic.csv | Treatment content ID, start/end dates, treatment method, treatment details, best overall response | Treatment history curation |
| cow_therapeutic_medicine.csv | Brand name, generic name, drug code, YJ code, HOT9 code, NHI drug price standard code | Prescription and injection orders |
| cow_recist.csv | Outcome assessment ID, overall response, assessment date, sum of diameters, test type | Imaging and response assessment |
| cow_reaction_data.csv | Adverse event ID, observation/onset/resolution/action dates, causality, English and Japanese AE names | Adverse event curation |
| cow_laboratory.csv | Test date, original test item code, test item name, value and unit, including CYFRA, CEA, CA19-9, neutrophils, lymphocytes, AST, ALT, sodium, potassium, creatinine/eGFR, CK | Laboratory test records |
| cow_outcome.csv | Outcome, last survival confirmation date, date of death, cause of death | Outcome and survival follow-up |
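Among the background fields above, the Brinkman Index is a derived value: cigarettes smoked per day multiplied by years of smoking. A minimal sketch follows, with a hypothetical function name and the assumption that both inputs are already curated as non-negative numbers.

```python
def brinkman_index(cigarettes_per_day, smoking_years):
    """Return the Brinkman Index: cigarettes per day times years of smoking."""
    if cigarettes_per_day < 0 or smoking_years < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day * smoking_years
```

For example, 20 cigarettes per day over 30 years gives a Brinkman Index of 600.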
Both datasets use research-grade governance for regulatory use and patient privacy.
Approved research agreements support regulatory and external-control use.
Patient-level data requires a JRA under Japanese research governance.
Catalog queries show population availability before custom curation.
Raw data is de-identified or pseudonymized, with sharing governed by JRA and IRB review.
Send the indication, biomarkers, and study question. We return patient counts, hospital coverage, and biomarker availability.
Request Feasibility Query