All you need to know about NCI, CDISC and SDTM controlled terminology
The use of controlled terminology (CT) is vital to successful clinical study build, however, understanding the concept and its various subsets can be a challenge.
What should a clinical metadata repository do? | Formedix
As the clinical research industry moves towards more efficient and automated processes, clinical metadata repositories are becoming a must. You’ve probably heard of them; sometimes they are referred to as ‘clinical MDRs’ or simply ‘CMDRs’. But you might not know exactly what they do and how they can make your clinical study build more efficient.
In this blog, we answer the question ‘What should a clinical metadata repository do?’ and explore the different functionalities of a clinical metadata repository. Armed with this information, you’ll be better placed to decide whether a CMDR will work for your organization, and how to ensure you pick the best platform for your study requirements.
The importance of a clinical metadata repository
When clinical trial metadata is not managed efficiently, it can be unreliable, inconsistent, and outdated. Typically, metadata is stored across multiple locations and is non-standardized. It can be difficult to find the correct forms to use, or the latest approved internal standards to adhere to when collecting data. A clinical metadata repository can help overcome these problems.
What exactly is a clinical metadata repository?
A clinical metadata repository is a ‘single source of truth’: a centralized, reliable library of metadata assets such as forms, terminologies and datasets that are ready to use and reuse when you need them. By implementing a comprehensive clinical metadata repository, organizations can streamline their metadata management, improve submission data quality, get earlier data insights, and accelerate new drugs to market.
What do we mean by ‘metadata’ in clinical trials?
The term ‘metadata’ in clinical trials refers to the content that forms the building blocks of your clinical study, for example forms, annotations, terminologies, datasets, mapping and files. Metadata is ‘data that describes data’. An example of metadata within a dataset would be a ‘variable’. A ‘variable’ describes a particular measurement (e.g. height, age, weight) and how it should be recorded. It will allow for a range of possible measurement values within the data itself. The metadata is a description of the data (e.g. is it numeric or text, what are the possible values? etc.) while the data is the chosen value.
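To make the distinction concrete, here is a hedged sketch in Python (the variable name, label and bounds are invented for illustration): the dictionary is metadata describing a measurement, while the number passed in is the data itself.

```python
# Illustrative sketch: metadata describes the variable; data is the recorded value.
# The name "HEIGHT" and the bounds below are hypothetical, not from any standard.
height_metadata = {
    "name": "HEIGHT",
    "label": "Height of subject",
    "data_type": "numeric",
    "unit": "cm",
    "valid_range": (30, 250),  # plausible bounds for a height measurement
}

def is_valid(value, metadata):
    """Check a data value against its metadata description."""
    low, high = metadata["valid_range"]
    return isinstance(value, (int, float)) and low <= value <= high

# The data itself is just the chosen value:
print(is_valid(172.5, height_metadata))  # True
```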
What are some of the key benefits of implementing a clinical metadata repository?
* Better data quality
A key benefit to using a clinical metadata repository is the ability to improve data quality. By standardizing the way you collect clinical data up front (before any data is actually collected), you can ensure that you collect the right data, in the right format, giving you confidence that your submission to the regulator will be successful.
* Earlier decision making
By having access to standardized metadata from the get-go, healthcare professionals can easily access and retrieve relevant clinical data for research, analysis, and reporting purposes. They can make crucial decisions about the safety and efficacy of a drug more quickly, ultimately improving patient outcomes.
* Easier collaboration
Efficient metadata management plays a crucial role in the process of drug discovery. With the increasing volume and complexity of clinical trial data, it is essential to have a centralized repository that enables you to share, reuse and standardize metadata. Standardization of metadata in clinical trials ensures consistency and accuracy in data interpretation, enabling easier collaboration among different stakeholders. It also allows you to compare data for different studies more easily.
We’ve touched on some key benefits here, but there are many more ways that clinical metadata repositories are crucial to building effective clinical trials.
Considerations for your clinical metadata repository
It’s not enough to simply find a clinical metadata repository and implement it. Different platforms have different capabilities, and not every offering will meet your requirements or enable you to maximize your clinical metadata management.
If the platform you’re looking at doesn’t meet some basic specifications, you could be wasting your time and money. So, what should you expect when it comes to choosing a clinical metadata repository?
What are the basic functionalities of a clinical metadata repository?
The following functionalities really are a ‘no-brainer’. A clinical metadata repository should as a minimum:
* Give you easy access to CDISC-compliant templates to make study build faster and easier
* Stay up to date with new versions of the standards, and support older versions
* Let you reuse your standards
* Have versioning in place
* Have built-in traceability
* Let you find your metadata easily and quickly
What are some additional features of a clinical metadata repository?
Beyond the list above, there are some additional features that could really make a difference to the performance of your clinical trials. We’ve identified four additional considerations for building successful clinical trials.
It’s really important to understand how content is used in your clinical trials. A metadata repository should provide clear visibility of:
* where your metadata is used
* how it’s used
* how often it’s used
* the impact of changing it
That way, when you have lots of change requests, you can easily see which ones you need to focus on.
If the content needs to be changed, you need to be able to see what the impact of those changes will be, so you can decide the appropriate way forward. For example, perhaps you’re thinking of making a change to a standard that’s used in many studies. Implementing this change will mean making lots of changes to lots of studies. That’s quite a bit of work that’ll take considerable time and resources. So, you need to know if making the change is worth it.
Do the benefits of making the change outweigh the work and resources needed to make it happen? If the answer is ‘yes’, then it’s worth making the change. That’s why it’s important that your metadata repository gives you the visibility to see the impact of that change before you make the change.
Metadata needs to be governed to ensure it doesn’t become out of date or invalid. Neglected metadata can lead to poor quality or incorrect data collection or processing. Plus, it probably doesn’t meet the needs of the stakeholders involved. Your metadata, then, needs to be subject to a governance lifecycle.
Data governance refers to the processes and policies that govern the management, quality, and security of clinical data. It includes establishing data standards, defining data ownership and stewardship, and implementing data access controls. Effective data governance ensures that the clinical metadata repository is maintained and utilized in a consistent and secure manner.
Here’s a simple example of a governance lifecycle:
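A lifecycle like this can be pictured as a small state machine. A minimal sketch, with illustrative states (Draft, In Review, Approved, Deprecated) that your own governance process may name or order differently:

```python
# A minimal sketch of a governance lifecycle as a state machine.
# The states and transitions are illustrative, not a prescribed workflow.
ALLOWED_TRANSITIONS = {
    "Draft": {"In Review"},
    "In Review": {"Approved", "Draft"},  # reviewers may approve or send back
    "Approved": {"Deprecated"},          # e.g. superseded by a newer version
    "Deprecated": set(),
}

def advance(current, target):
    """Move a metadata asset to a new lifecycle state, if the move is allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Cannot move from {current} to {target}")
    return target

state = "Draft"
state = advance(state, "In Review")
state = advance(state, "Approved")
```

The point of the transition table is that an asset can never skip review: there is no path from Draft straight to Approved.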
Governance is one of the most important features of a metadata repository. It helps you:
* Improve data quality
* Control and understand the workflow of content
* Assess the impact of change
* Develop robust organizational standards
With good governance, you can trust your metadata is accurate. And if your metadata is accurate, your data will be accurate too.
If you can see the detailed lifecycle of a standard, there’s full transparency across your team, and you reduce the risk of regulatory non-compliance.
Collaboration allows a team of people, regardless of their role or location, to work together easily. For example, a group of people with different roles maintaining a set of global standards need to be able to collaborate on those standards without worrying about poor document control or complex approvals processes.
From Data Managers to Biostatisticians, each person involved in clinical trial build brings their own set of requirements. A clinical metadata repository should accommodate the needs of each role while ensuring data transparency and robust processes. That way, stakeholders can easily see the impact of other people’s changes, communication is clear, and there’s less room for misinterpretation. The end result is a better experience for all users. You get much more consistent, higher quality data, faster.
EDC integration
A metadata repository should be at the centre of your clinical trial build. It should work with other systems and databases to make your whole organization more efficient and increase data quality.
Every clinical trial is different. For example, you might have to use different electronic data capture (EDC) systems for different studies, or even for different phases of one study. Having the ability to design studies and standards with specific EDC systems in mind, and the freedom to use multiple systems if required, makes building clinical trials much easier.
A clinical metadata repository with EDC integration lets you fully design your Case Report Forms (CRFs), including complex tasks like edit checks and conditions. You can see what your forms look like upfront, before you upload them to the EDC. You can carry out a simple review and approval process, and generate specifications for review outside the system. Once you’re happy with your CRFs, you can automatically build your EDC system, which saves time and money by removing manual interpretation of your specs. Overall, your studies will be higher quality and more consistent.
You should also consider whether you need other systems to be able to pull metadata from your chosen metadata repository, and push data to it. If your metadata repository has an API, your other systems can talk to it and make it the central hub for knowledge in your organization.
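As a rough illustration of that hub pattern, the sketch below models push and pull against a repository. The class, the endpoint paths in the comments, and the asset IDs are all hypothetical stand-ins, not a real CMDR API.

```python
# Hypothetical sketch of other systems pulling metadata from, and pushing
# metadata to, a central repository. The endpoint paths are invented.
import json

class MetadataRepositoryClient:
    """In-memory stand-in for an HTTP client talking to a CMDR API."""

    def __init__(self):
        self._store = {}

    def push(self, asset_id, metadata):
        # e.g. PUT /api/assets/{asset_id}
        self._store[asset_id] = json.loads(json.dumps(metadata))  # deep copy via JSON

    def pull(self, asset_id):
        # e.g. GET /api/assets/{asset_id}
        return self._store[asset_id]

client = MetadataRepositoryClient()
client.push("form/demographics/v2", {"status": "Approved", "fields": ["AGE", "SEX"]})
print(client.pull("form/demographics/v2")["status"])  # Approved
```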
How do you implement a clinical metadata repository?
We’ve explored the different functionalities of a clinical metadata repository, and how some added functionality can elevate your clinical trials and make them far more efficient. We’ve shown you that, by leveraging the power of a centralized clinical metadata repository, you can improve your internal processes, build faster, more efficient clinical studies, and accelerate the delivery of new treatments to market.
Hopefully by now, you’ll have a good idea of how a clinical metadata repository can improve your study build processes, and what to look for in any clinical metadata repository.
Are you ready to implement your clinical metadata repository? Then you’ll want to download our free quick guide to implementing a CMDR! In the guide, we cover:
* Choosing a CMDR that’ll help you achieve efficiencies in study build
* How to ensure stakeholders’ needs are met during implementation
* The approach you should take when standardizing metadata, to reduce effort and time spent on study set up
Why should you use LOINC codes for SDTM? | LOINC for lab tests
In this blog, we take a detailed look at LOINC Codes and SDTM. We look at several LOINC code examples and examine why it’s beneficial to use the codes alongside CDISC and NCI controlled terminology. We also include details of where to find the official LOINC codes list.
What are LOINC Codes used for?
LOINC stands for ‘Logical Observation Identifiers Names and Codes’.
According to the LOINC website, ‘Reference labs, healthcare providers, government agencies, insurance companies, software and device manufacturers, researchers, and consumers from around the globe use LOINC to identify data and move it seamlessly between systems.’
Essentially, LOINC provides a unique identifier for each medical observation – such as lab test. It also provides additional information, such as standardized specimen type, category, etc.
LOINC is an internationally recognized classification system. It is often requested in regulatory data submissions to provide context to clinical measurement data, for example, Labs and ECG (electrocardiogram).
Top tip: central labs and other data providers may offer LOINC codes with their data transfer. Make sure this is included as a requirement in your Data Transfer Specification (DTS) document (the document that outlines the agreed collection method), as this will ensure the data is LOINC coded at source.
Important information about LOINC codes for SDTM
CDISC and the NCI provide controlled terms for observations such as lab test names, for example LBTESTCD (Laboratory Test Code) and LBTEST (the name of the lab test performed).
So why would another classification system be useful? The simple answer lies in the difference between the classification systems. The CDISC/NCI controlled terms for Lab test are not unique. Instead, they require additional information to differentiate the lab test. But LOINC codes are unique, so only the LOINC code is required to identify the test.
The CDISC knowledge base contains important information about LOINC and SDTM, including:
* While LOINC codes are NOT required by CDISC standards, there are benefits to using them
* LOINC codes have been an FDA requirement since March 2020
* As not all tests have LOINC terms, the FDA allows codes to be omitted in such cases, provided this is explained in the Study Data Reviewer’s Guide (SDRG) and some effort has been made to obtain them.
LOINC code examples
LOINC code example - Glucose:
From the CDISC NCI SDTM Controlled terms:
* LBTESTCD (C65047) Term “GLUC” (C105585)
* LBTEST (C67154) Term “Glucose” (C105585)
However, this is simply “A measurement of the glucose in a biological specimen.” Therefore the “specimen” is required to identify the measurement. CDISC NCI SDTM also provides a list of terms for Specimen (see below):
* LBSPEC (C78734 Specimen Type)
* “URINE” (C13283)
* “BLOOD” (C12434)
Within the specimen types, the measurements are also classified by the testing process or method used. CDISC NCI SDTM also provides controlled terms for the different types of measurement e.g. Test strip (dipstick) vs. quantitative measurement, so LBMETHOD is also needed:
* LBMETHOD (C85492 Method)
* “DIPSTICK” (C106516)
LOINC code example – CDISC SDTM Labs (LB) domain
The table below shows how the CDISC SDTM Labs (LB) domain requires several columns populated to correctly identify a particular clinical measurement.
To represent a lab test in SDTM, several columns (LBTESTCD, LBSPEC, LBMETHOD etc.) are required, each having a different set of controlled terms. As the source data may not contain sufficient detail, it can be difficult to correctly classify a Lab test directly into CDISC SDTM – especially partial data such as local lab results.
However, LOINC provides a unique “code” for each test, along with an SDTM mapping to specimen type, method, units, etc. Below is a breakdown of the LOINC codes for the Glucose example:
Glucose (Urine) returns 27 tests, classified by time (timepoint, 6, 8, 10, 24 hours) and method (Quantitative, Test Strip, Test Strip automated).
So for [timepoint], [test strip] there are three possible tests, depending on the units e.g. “positive/negative”, “mmol/L” , “mg/dL”.
Most results in SDTM are submitted in SI units (here “mmol/L”), with “mg/dL” being the conventional units. If the result is simply “positive/negative”, it would be submitted as a “character” result.
From the LOINC table and the CDISC NCI controlled terms, we can construct a code table to translate between the two schemes.
Note the code table should map the LOINC code (5792-7) to the CDISC code (C105585), not the free text (Glucose). This is because the free text is subject to revision between CDISC controlled term versions, but the CDISC code stays constant for a given concept.
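A minimal version of such a code table, using only the glucose/urine/dipstick row built from the codes quoted earlier in this example, might look like:

```python
# A minimal code table mapping a LOINC code to the corresponding CDISC NCI
# concept codes. Only the glucose/urine/dipstick row from the example is shown.
CODE_TABLE = {
    "5792-7": {                 # glucose, urine, test strip (from the example)
        "LBTESTCD": "C105585",  # NCI code for GLUC / Glucose
        "LBSPEC": "C13283",     # URINE
        "LBMETHOD": "C106516",  # DIPSTICK
    },
}

def sdtm_codes(loinc_code):
    """Look up the CDISC NCI codes for a LOINC-coded result."""
    return CODE_TABLE[loinc_code]

print(sdtm_codes("5792-7")["LBSPEC"])  # C13283
```

Because the lookup returns NCI codes rather than free text, the table survives CDISC controlled terminology version changes, as noted above.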
With the code table in place, a lab result would be processed as follows:
First get the SDTM LBTESTCD/LBTEST string pair
Next, get the specimen type:
Original Unit (LBORRESU)
It is common for the unit string supplied by the EDC or central lab to not match the CDISC NCI string e.g. “MG/DL” instead of “mg/dL”. This requires manual mapping correction. However, with the LOINC code table, we know what the correct unit string should be (mg/dL), so can insert the value from the code table. A simple programmed check can compare both strings and trigger an alert for manual review.
We can extend the LOINC code table to assign the “SI Unit”, “Conventional Unit” and then document the conversion factor between the two.
Example Lab (LB) output
This process is not restricted to CDISC NCI SDTM; it can be applied to other controlled terms. Additional information, such as “Lab category” (LBCAT), can also be assigned using the LOINC code.
LOINC code example – Lab Category (LBCAT)
LOINC code example – Lab Toxicity Grade (LBTOXGR)
To calculate LBTOXGR, the measurement or lab test must first be identified and the appropriate grade threshold calculations found from the “Common Terminology Criteria for Adverse Events (CTCAE)” documentation. Having the LOINC code makes this process more reliable, as the grade calculation may depend on several factors (specimen, unit, etc). Additional information, such as the subject’s gender, may have to be included to select the correct calculation.
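A hedged sketch of that grading step: the LOINC code selects the threshold rules, and the result (expressed as a multiple of the upper limit of normal) selects the grade. The cut-off numbers below are illustrative placeholders; always take the real thresholds, and any specimen/unit/gender dependencies, from the CTCAE documentation.

```python
# Illustrative sketch of grading a lab value against CTCAE-style thresholds.
# The cut-offs below are placeholders, NOT actual CTCAE values.
GRADE_RULES = {
    # LOINC code -> list of (max multiple of ULN, grade), ascending
    "1742-6": [(1.0, 0), (3.0, 1), (5.0, 2), (20.0, 3)],  # e.g. ALT, hypothetical
}

def tox_grade(loinc_code, value, uln):
    """Return the toxicity grade for a result, given the upper limit of normal."""
    ratio = value / uln
    for limit, grade in GRADE_RULES[loinc_code]:
        if ratio <= limit:
            return grade
    return 4  # beyond the highest threshold

print(tox_grade("1742-6", 100, 40))  # ratio 2.5 -> grade 1
```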
LOINC Code Example – “Conventional Unit” to “SI Unit” conversion
Normalizing the original result (--ORRES) to the standardized numeric result (--STRESN) is common for most SDTM Findings datasets. This process relies on the correct identification of the measurement being standardized. We often find different conversion factors are required depending on the original unit, specimen, etc., so LOINC codes are very helpful in controlling this process. Note this process can cover Lab, ECG, Vital Signs, etc., as these all have available LOINC codes.
To drive the normalization process, we first prepare a code table aligning the LOINC code to the units.
And using the equation below to calculate the normalized value:
If the conversion factors are not fractions e.g. 0.55 instead of 5/9, we simply set “Constant A” to the multiplier (0.55) and set “Constant B” to one. This will introduce a rounding error, which should be documented in the conversion table “Comment” column. Additional information, such as LOINC table version number, measurement name, specimen type, or additional notes, should be included to provide context.
In preparation for the normalization process, we must align the “Original Units” to the CDISC/NCI terminology. This is done using a code table to convert the collected “Unit string” into the CDISC NCI Unit NCI Code.
It’s recommended to “clean” the collected unit first e.g. upper-case the string and remove trailing spaces etc. This reduces the number of entries in the alignment code table and also makes the SDTM conversion more robust.
CDISC NCI also provides a list of synonyms for each submission value. These can be imported into the alignment table from the NCI documentation. For example:
Example of data cleaning process:
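A minimal sketch of the cleaning and alignment steps described above. The alignment rows are illustrative; in practice the table would also carry the NCI code for each unit, as recommended earlier.

```python
# Clean the collected unit string before the synonym lookup, so fewer raw
# variants need entries in the alignment table.
def clean_unit(raw):
    """Upper-case and strip the collected unit string."""
    return raw.strip().upper()

ALIGNMENT_TABLE = {
    # cleaned string -> CDISC submission value (illustrative rows only;
    # a real table would map to the NCI unit code as well)
    "MG/DL": "mg/dL",
    "MG / DL": "mg/dL",
    "MMOL/L": "mmol/L",
}

def align_unit(raw):
    """Map a collected unit string to its CDISC submission value."""
    return ALIGNMENT_TABLE[clean_unit(raw)]

print(align_unit("  mg/dl "))  # mg/dL
```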
With the aligned VSORRESU, for each input data row we perform a lookup using the LOINC code and “Unit NCICode” to get the conversion factors (Constant A, Constant B, Constant C, Constant K) and apply the normalization formula:
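The formula itself is not reproduced in the text, so the sketch below assumes a generic linear form over the four named constants; the exact placement of Constant C and Constant K is a guess and should be checked against your actual conversion table. The row shown is the body-temperature (Fahrenheit to Celsius) case, where Constant A / Constant B is the 5/9 fraction mentioned earlier.

```python
# Hedged sketch of the lookup-and-normalize step. The linear form below,
# (value + C) * A / B + K, is an assumed arrangement of the four constants.
CONVERSIONS = {
    # (LOINC code, unit code) -> (A, B, C, K). "degF" stands in for the
    # real NCI unit code in this hypothetical body-temperature row.
    ("8310-5", "degF"): (5, 9, -32, 0),  # Fahrenheit -> Celsius
}

def normalize(value, loinc_code, unit_code):
    """Apply the conversion factors for this measurement and unit."""
    a, b, c, k = CONVERSIONS[(loinc_code, unit_code)]
    return (value + c) * a / b + k

print(round(normalize(98.6, "8310-5", "degF"), 1))  # 37.0
```

Keeping A and B as a fraction (5 and 9, rather than 0.5555...) avoids the rounding error noted above for non-fractional factors.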
This approach has several advantages:
* Correct identification of measurement (Body Temperature).
* Uses the “unit” NCI code – not the collected text – which provides a “submission ready” unit.
* Allows different conversion factors for each measurement. For example, it is quite common for conversion factors to depend on specimen type or method.
* Provides full documentation of the normalization process (often requested for FDA submissions).
* Standardizes the process across all studies while remaining neutral to the CDISC NCI SDTM terminology version, by using the NCI code rather than the “unit” text.
Where to find the LOINC Codes list
Find information on common LOINC Codes, including lists of the most frequently used codes, on the LOINC website.
Why use LOINC codes for SDTM?
* Greater consistency
* Easier analysis and reporting
* More efficient, standardized process
Using the above approach gives a consistent assignment of several SDTM LB columns, and provides a mechanism to standardize the unit normalization process.
For local lab data or results without LOINC codes, there is a robust coding process available, with several tools for the assignment of LOINC.
LOINC also provides the option of a more granular and reliable grouping of results for analysis and reporting, as each lab test is identified.
Standard processes, such as normalization to SI units, assignment of “Category”, calculation of “LBTOXGR” etc. can all leverage the LOINC code to provide a fully documented and standardized process. This can be applied across all study SDTM conversions.
Finally, LOINC Codes are available for several SDTM findings datasets e.g. Labs (LB), Vital Signs (VS), ECG (EG), etc.
Want to find out more about SDTM and using LOINC codes? Read our blog all you need to know about SDTM!
About the author
Solutions Consultant | Formedix
Ed Chappell has been working as a Solutions Consultant with Formedix for over 15 years, and has 22 years’ experience in data programming. He authored and presents our training courses for SEND, SDTM, Define-XML, ODM-XML and Dataset-XML.
Ed was heavily involved in the development of our ryze dataset mapper, and works closely with customers on SDTM dataset mapping. As an expert in clinical data programming, Ed also supports customers with Interim Analysis (IA) SDTM and FDA SDTM clinical trial submissions.
Head of Product Management Kevin Burges - CDISC Volunteer Spotlight
Our own Kevin Burges, Head of Product Management, recently featured in CDISC's monthly Volunteer Spotlight!
Certara acquires Formedix LLC to increase data standardization across clinical trials for faster study setup, time to analysis and submissions
The Formedix ryze cloud-based clinical metadata repository and SDTM automation suite and the Certara Pinnacle 21 data validation platform combine to power faster clinical trials.
Come find us at PHUSE EU Connect 2023 - booth #1! | Formedix
📅 5th – 8th November 2023
See us at the 2023 CDISC US Interchange in October! | Formedix
📅 18th – 19th October 2023
We'll be in San Diego for SCDM’s annual conference 2023! | Formedix
📅 8th – 11th October 2023
How clinical trial technology can streamline the research process
In the field of drug development, the journey from lab discoveries to new treatments reaching the market is long and complex. Clinical trials play a crucial role in this journey: they bridge the gap between theory and practice.
However, traditional methods of conducting clinical trials can also be complicated. Problems with inefficient data collection and slow processing times can plague trials.
Using Define-XML for faster, better quality and more efficient studies
Since its inception in 1997, the Clinical Data Interchange Standards Consortium (CDISC) has developed and supported globally-adopted data standards to improve clinical trial efficiency. Clinical data standards are now recognized as playing a vital role in the entire end-to-end clinical trial process. Standardization allows for faster, better quality and less costly drug discovery.
One of the most widely used standards today is Define-XML. The latest version of Define-XML is v2.1, which went live in May 2019. Get release information on the CDISC website.
What is Define-XML?
According to CDISC: “Define-XML is required by the United States Food and Drug Administration (FDA) and the Japanese Pharmaceuticals and Medical Devices Agency (PMDA) for every study in each electronic submission to inform the regulators which datasets, variables, controlled terms, and other specified metadata were used.”
The FDA’s Technical Conformance Guide explains that Define-XML is "arguably the most important part of the electronic dataset submission for regulatory review” because it helps reviewers gain familiarity with study data, its origins and derivations.
The standard itself is known as ‘Define-XML’. The file that’s submitted to the FDA upon completion is the data definition file, known simply as ‘define.xml’.
Define-XML as a dataset descriptor
It is commonly thought that Define-XML is simply a dataset descriptor: a way to document what datasets look like, including the names and labels of datasets and variables, what terminology is used etc. This is essentially what Define-XML was created for.
But by instead thinking of Define-XML as a tool to create better quality, more efficient clinical studies, users can unlock the true potential of the standard.
Progressive uses of Define-XML
You can use Define-XML to help you optimize the end-to-end clinical trial process in the following ways:
1) Use Define-XML in your CRF design process
Many organizations treat Define-XML as an afterthought: only when case report forms (CRFs) are designed, data is collected and the study is complete do they think about creating the define.xml file for FDA submission.
But this approach can lead to incomplete data, the need for protocol amendments, complex mapping, and increased quality control effort. How do you know when designing the CRF that you’re collecting all the relevant SDTM data? And once the data has been collected, how can you verify the submission is what you intended when you have no study definition to compare it with? It can take valuable time and resources to make sure all data has been collected in the right format, which can ultimately prolong the study.
A more efficient approach is to use Define-XML to define your study, end-to-end, right at the start. This includes defining SDTM, SEND, and ADaM datasets upfront.
Using Define-XML and SDTM to design submission datasets at the start of a study makes it easier to set up your study and create your case report forms (CRFs). By setting out what information should ultimately appear in your submission datasets before you collect any patient data, you can create CRFs with confidence, knowing that you’re collecting all the required information in the right format.
For example, the SDTM standard gives the ‘Identifier’, ‘Topic’, ‘Qualifier’, and ‘Timing’ variables required in your submission datasets. If you know upfront what variables to use, you can create your CRFs accordingly.
You can also annotate your CRFs with SDTM variables upfront. This helps ensure all your collected data has a place in SDTM, and has the additional benefit of providing basic mapping between the forms and the datasets. CDISC provides a permissible mechanism to extend Define-XML, allowing the storage of additional metadata such as complex dataset mappings (e.g. how data from two sources may be merged into a single dataset).
In this way, using Define-XML upfront, rather than retrospectively, can help you ensure your study is a success. Read more about using Define-XML for dataset design.
2) Use Define-XML in EDC data conversions
Define-XML is not limited to just describing CDISC SDTM and ADaM dataset structures. From an electronic data capture (EDC) system, you can export proprietary dataset formats which can be described using the Define-XML model. With the right tools, you can automatically generate a Define-XML that describes the EDC export datasets using the CRFs/eCRFs themselves. This can then be displayed in a friendly HTML or PDF format allowing early visibility of the datasets that will be delivered by the EDC system.
The source proprietary dataset specification enables upfront mapping of EDC datasets to SDTM datasets. These mappings can be described (and made machine-executable) using extensions to Define-XML, and human-readable SDTM mapping specifications can be produced automatically, aiding review and approval of mappings.
In addition, the Define-XML mapping extensions provide a machine-executable format that can be processed by data transformation code to enable the automatic conversion of datasets in commercially available tools.
The diagram below shows the flow of data from data capture through to CDISC datasets and the part CDISC metadata plays. Metadata is used in designing data capture forms using CDISC ODM and Define-XML in designing destination datasets. All of this vendor-neutral metadata can form the basis of form and dataset libraries which can be re-used from study to study.
3) Creating and re-using dataset libraries
Define-XML is the perfect tool to help you create libraries of datasets (EDC, SDTM, ADaM), mappings, page links to CRF variables, and so on for re-use from one study to the next.
A metadata-driven approach using Define-XML can optimize a single study from set-up to submission. But creating libraries of reusable metadata will make future studies even more efficient.
If you have a library of data acquisition forms, proprietary EDC datasets, SDTM datasets, ADaM, and dataset mappings that are approved internally and ready to use, you’ll only have to create new content where there is a specific requirement for it. All other approved metadata is already there in your library.
4) Automating dataset validation
Another major advantage to defining datasets upfront is that validation can also be done up front. By creating a prospective definition of the intended datasets at the start of the study, it is possible to machine-validate study dataset designs for conformance to external standards. It is also possible to validate that populated datasets match the original specifications. This way, data quality and submission compliance are built-in upfront with less reliance on downstream validation.
We go into a little more detail on validation possibilities below:
* Compare study dataset designs, including controlled terminology, to external and internal standards
When designing SDTM datasets and creating controlled terms, it is imperative that these comply with the latest and/or chosen version of National Cancer Institute Controlled Terminology (NCI CT). During the dataset design phase, automatic comparisons and compliance checks should be made with the appropriate version of NCI CT.
Companies should also develop their own domains that comply with CDISC SDTM but include content that falls outside of the standard Implementation Guide domains. For example, specialist findings domains may be required for a particular therapeutic area. In this scenario, companies should compare study dataset designs against their own data standards to check for differences and either accept or reject them accordingly.
* Compare ‘As specified’ study dataset specification against ‘As delivered’ study dataset designs
Increasingly, studies are outsourced to Contract Research Organizations (CROs), which increases the burden on sponsors in two areas: (a) upfront specification of deliverables and (b) downstream validation of those deliverables.
When dataset validation is done upfront, a human-readable target SDTM specification (in HTML, PDF, Word or Excel) can be given to a CRO to describe what is expected to be in the delivered datasets: an ‘As specified’ study dataset specification.
When CROs return the datasets, they should also provide ‘As delivered’ study dataset metadata. With both ’As specified’ and ’As delivered’ study dataset metadata available, it is easy to compare the study dataset metadata to verify that the ’As delivered’ dataset actually matches what was specified.
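That comparison can be largely automated. A toy sketch follows (the structures are heavily simplified; real Define-XML metadata carries labels, lengths, origins, controlled terms and more):

```python
# Hedged sketch: comparing 'As specified' vs 'As delivered' dataset metadata.
# Datasets are reduced to {name: {variable: data_type}} for illustration.
as_specified = {"LB": {"LBTESTCD": "text", "LBORRES": "text", "LBORRESU": "text"}}
as_delivered = {"LB": {"LBTESTCD": "text", "LBORRES": "float", "LBORRESU": "text"}}

def diff_metadata(spec, delivered):
    """Return (dataset, variable, specified_type, delivered_type) per mismatch."""
    issues = []
    for ds, variables in spec.items():
        for var, vtype in variables.items():
            got = delivered.get(ds, {}).get(var)
            if got != vtype:
                issues.append((ds, var, vtype, got))
    return issues

print(diff_metadata(as_specified, as_delivered))
# [('LB', 'LBORRES', 'text', 'float')]
```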
* Compare dataset data to dataset metadata and SDTM or ADaM
Having a target SDTM Define-XML available upfront allows automated comparison of delivered datasets against study dataset metadata, either as specified or as delivered. Comparing data to as specified Define-XML verifies that the data matches what was originally intended/specified. And comparing data to as delivered Define-XML ensures that the data matches the dataset definition. This is important as it will ultimately be this as delivered Define-XML that is submitted to the FDA.
You can also compare with CDISC standards such as SDTM and ADaM using the CDISC Open Rules Engine (CORE). CORE is a validation engine that can help you easily validate against the standardized conformance rules published by CDISC. In April 2023, we released Formedix CORE, a free desktop application for validating datasets using the CORE engine. This will validate against both SDTM/ADaM, and the Define-XML metadata. CDISC will also be adding the ability to validate against regulatory rules (FDA, PMDA etc) in the future.
Define.xml file submission
As we’ve shown, there are many benefits to using Define-XML - not only as a dataset descriptor, but as a means to streamline the clinical study process.
Define-XML should not be thought of as simply a submission deliverable, but as a CDISC model that helps optimize the end-to-end clinical trial process. It can be used to establish dataset libraries that promote study-to-study re-use, as well as to drive efficiencies through expedited study set-up and streamlined dataset conversions.
Learn how you can create Define-XML in one click with our Visual Define-XML editor, or check out our blog on what’s new in Define-XML 2.0, including Define-XML 2.0 examples.
The 6 dos and don’ts of Define-XML
Ready to find out more about Define-XML creation?
Download our free guide to the 6 dos and don’ts of Define-XML to help you implement the standard and get your study submission ready.
Author's note: this blog post was originally published in May 2014 and has been updated for accuracy and comprehensiveness.
About the author
Head of Product Management | Formedix
Kevin Burges has been working at Formedix for over 20 years. Over time his role has changed from Developer to Senior Developer, to Technical Director and now Head of Product Management.
Kevin has a strong interest in metadata management and automation as an engine for streamlining clinical trials, and he works closely with customers to evolve the ryze platform with their needs in mind. He has also worked closely with CDISC since 2000, and has won awards for outstanding achievement towards advancing CDISC standards.
Nowadays, he’s part of the Data Exchange Standards team, which includes ODM, Define-XML and Dataset-XML.