Session 1 — Write your first application definition¶

Duration: 1 hour

Goal: Create NXdouble_slit — a complete, valid application definition for the double-slit experiment — starting from a blank YAML file.

Info

We will go through this example for writing an application definition rather quickly. If you want to revisit similar material at a more incremental pace later, have a look at the pynxtools tutorial for writing an application definition:

pynxtools > Tutorial > Writing your first application definition

Each step you will see in this guide is already pre-defined and ready to be copied. For each step, try to understand what the individual concepts are and how the notation works before you copy the snippets to your application definition.

Why a double-slit experiment?¶

It is physically simple (one source, two slits, one detector), yet it covers the main structural challenges in data modelling:

Instrument geometry (distances, gaps)
A 2D measurement array with calibrated axes
A distinction between raw and processed data
Optional metadata (material, coherence length)

The skills you learn here transfer directly to XPS, MPES, reflectometry, or any other technique.

The data model we want to build¶

NXentry
 ├── definition = "NXdouble_slit"
 ├── title, start_time, end_time
 ├── NXinstrument
 │   ├── source(NXsource)
 │   │   ├── wavelength  [NX_WAVELENGTH]
 │   │   ├── coherence_length  [NX_LENGTH, recommended]
 │   │   └── type  [Laser | Filtered lamp | LED, optional]
 │   ├── double_slit(NXslit)
 │   │   ├── x_gap, slit_separation  [NX_LENGTH]
 │   │   └── height, material  [optional]
 │   └── detector(NXdetector)
 └── interference_pattern(NXdata)  ← calibrated default plot
     ├── data  dims (n_x, n_y)
     ├── x  [NX_LENGTH]
     └── y  [NX_LENGTH]

interference_pattern contains spatial axes (mm from centre). It is used as the default plot by NOMAD and other viewers.

Note that the NXdouble_slit application definition is already part of pynxtools. We will still build it from scratch here, but because it ships with pynxtools we can use it directly later, without having to add it manually.

Step 0 — Tools¶

nyaml¶

NeXus definitions must follow the NXDL language. Typically, they are written in XML. Here, we are using a simpler YAML notation. The nyaml tool you installed earlier will help us losslessly convert between YAML and XML.

# Convert YAML ↔ NXDL XML
nyaml2nxdl NXdouble_slit.yaml --output-file NXdouble_slit.nxdl.xml
nxdl2nyaml NXdouble_slit.nxdl.xml --output-file NXdouble_slit.yaml

pynxtools dataconverter¶

Generate a template JSON showing all required paths:

dataconverter generate-template --nxdl NXdouble_slit

Step 1 — Skeleton¶

Create NXdouble_slit.yaml in a new working directory:

category: application
doc: |
  Application definition for a double-slit interference experiment.
type: group
NXdouble_slit(NXobject):
  (NXentry):
    definition:
      enumeration: [NXdouble_slit]
    title:
    start_time(NX_DATE_TIME):
      doc: |
        ISO 8601 datetime of the measurement start.

You can see its XML representation by running the converter:

nyaml2nxdl NXdouble_slit.yaml --output-file NXdouble_slit.nxdl.xml

Step 2 — Instrument and source¶

Add the instrument and light source inside (NXentry):

    (NXinstrument):
      source(NXsource):
        wavelength(NX_FLOAT):
          unit: NX_WAVELENGTH
          doc: |
            Central wavelength of the light source.
        coherence_length(NX_FLOAT):
          unit: NX_LENGTH
          exists: recommended
          doc: |
            Temporal coherence length — determines fringe visibility.
        type(NX_CHAR):
          exists: optional
          enumeration: [Laser, Filtered lamp, LED]

Unit categories

unit: NX_WAVELENGTH is a category, not a unit. It says the field must store a wavelength-equivalent quantity (nm, Å, µm, …). The actual unit is written by the file producer as an HDF5 attribute wavelength/@units.
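
To make the difference between a category and a unit concrete, here is a minimal Python sketch. The lookup table `UNIT_CATEGORIES` and the helper `unit_matches_category` are hypothetical and deliberately tiny; they are not part of pynxtools, and a real validator would more likely rely on a units library than on a hard-coded table.

```python
# Hypothetical, deliberately incomplete table: category -> accepted unit strings.
UNIT_CATEGORIES = {
    "NX_WAVELENGTH": {"m", "mm", "um", "nm", "angstrom"},
    "NX_LENGTH": {"m", "cm", "mm", "um", "nm"},
}


def unit_matches_category(unit: str, category: str) -> bool:
    """Return True if the concrete unit is listed under the given category."""
    return unit in UNIT_CATEGORIES.get(category, set())


# The definition only fixes the category; the file producer picks the unit
# and stores it in the HDF5 attribute wavelength/@units.
wavelength_units = "nm"
print(unit_matches_category(wavelength_units, "NX_WAVELENGTH"))  # True
print(unit_matches_category("s", "NX_WAVELENGTH"))               # False
```

The point of the sketch is that the schema never names "nm" itself: any unit from the category is acceptable, and the choice is recorded in the file.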

Step 3 — Slit and detector¶

Add the slit and detector inside (NXinstrument):

      double_slit(NXslit):
        x_gap(NX_FLOAT):
          unit: NX_LENGTH
          doc: |
            Width of each individual slit.
        slit_separation(NX_FLOAT):
          unit: NX_LENGTH
          doc: |
            Center-to-center distance between the two slits.
        height(NX_FLOAT):
          exists: optional
          unit: NX_LENGTH
          doc: |
            Height of the two slits.
      detector(NXdetector):
        distance(NX_FLOAT):
          unit: NX_LENGTH
          doc: Distance from the slit plane to the detector surface.

Step 4 — Optional processing + default plot¶

Add these two groups as siblings of (NXinstrument) inside (NXentry).

Optional processing provenance (uses `nameType="partial"`)¶

    processID(NXprocess):
      nameType: partial
      exists: optional
      doc: |
        One step in the pipeline from raw pixels to calibrated offsets.
        Replace 'ID' with a short name, e.g. 'pixel_calibration'.
        Multiple NXprocess groups are allowed.
      sequence_index(NX_INT):
        doc: Step order in the chain (1-based).
      description(NX_CHAR):
        doc: What this step does.
      program(NX_CHAR):
        exists: optional
      version(NX_CHAR):
        exists: optional
      date(NX_DATE_TIME):
        exists: optional
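
The idea behind partial names can be sketched in a few lines of Python. This is an illustration under a stated assumption, not the matching logic pynxtools actually uses: lowercase characters of the concept name are treated as fixed, and each run of uppercase characters is treated as a placeholder the writer may replace. The helper `partial_name_pattern` is hypothetical.

```python
import re


def partial_name_pattern(concept_name: str) -> re.Pattern:
    """Build a regex from a partial name such as 'processID'.

    Assumption: lowercase characters are fixed, while each run of uppercase
    characters is a placeholder the writer may replace with their own text.
    """
    parts = re.split(r"([A-Z]+)", concept_name)
    regex = "".join(
        r"[A-Za-z0-9_]+" if part.isupper() else re.escape(part)
        for part in parts
        if part
    )
    return re.compile(f"^{regex}$")


pattern = partial_name_pattern("processID")
for name in ("processpixel_calibration", "process_background", "calibration"):
    # Only names that keep the fixed 'process' prefix are accepted.
    print(name, bool(pattern.match(name)))
```

Under this assumption, several groups such as processpixel_calibration and process_background can coexist in one file and all map back to the same processID concept.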

Default plot¶

Note how we use enumeration here to indicate that the \@signal and \@axes attributes must match the fields defined within NXdata.

    interference_pattern(NXdata):
      doc: |
        Default plot: the calibrated 2D interference pattern with spatial
        axes. The signal data may be identical to the raw detector array or
        derived from it via one or more NXprocess steps.
      \@signal:
        enumeration: [data]
      \@axes:
        enumeration: [['x', 'y']]
      data(NX_NUMBER):
        unit: NX_ANY
        doc: |
          2D interference intensity after any processing steps.
        dimensions:
          rank: 2
          dim: (n_x, n_y)
      x(NX_FLOAT):
        unit: NX_LENGTH
        doc: |
          Horizontal spatial offset from the detector centre, derived from
          pixel index and pixel pitch.
        dimensions:
          rank: 1
          dim: (n_x,)
      y(NX_FLOAT):
        unit: NX_LENGTH
        doc: |
          Vertical spatial offset from the detector centre, derived from
          pixel index and pixel pitch.
        dimensions:
          rank: 1
          dim: (n_y,)

Here, we are using symbolic names (n_x, n_y) to name the array dimensions and reference them in the dimensions blocks. Using symbolic names instead of hardcoded integers makes the definition self-documenting and allows validation tools to verify dimensional consistency across fields.

Add the dimension symbols at the top of the file, outside of the class NXdouble_slit and before type: group:

symbols:
  doc: |
    Dimension symbols used in this definition.
  n_x: |
    Number of detector pixels along x.
  n_y: |
    Number of detector pixels along y.
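
What "verifying dimensional consistency" means can be sketched as follows. The function `bind_dimensions` is a hypothetical helper written for this illustration, not a pynxtools API: it binds each symbol to the first size it encounters and reports any field whose shape disagrees.

```python
def bind_dimensions(declared, shapes):
    """Bind dimension symbols to sizes and report conflicts.

    declared: field name -> tuple of symbols, e.g. {"data": ("n_x", "n_y")}
    shapes:   field name -> actual array shape, e.g. {"data": (4, 3)}
    """
    bound = {}
    errors = []
    for field, symbols in declared.items():
        shape = shapes[field]
        if len(shape) != len(symbols):
            errors.append(f"{field}: expected rank {len(symbols)}, got {len(shape)}")
            continue
        for symbol, size in zip(symbols, shape):
            # The first field that uses a symbol fixes its value.
            if bound.setdefault(symbol, size) != size:
                errors.append(f"{field}: {symbol}={size} conflicts with {bound[symbol]}")
    return bound, errors


# Symbols as declared in interference_pattern above.
declared = {"data": ("n_x", "n_y"), "x": ("n_x",), "y": ("n_y",)}

bound, errors = bind_dimensions(declared, {"data": (4, 3), "x": (4,), "y": (3,)})
print(bound, errors)

# An x axis with 5 points cannot describe a data array with 4 columns.
_, errors = bind_dimensions(declared, {"data": (4, 3), "x": (5,), "y": (3,)})
print(errors)
```

With hardcoded integers, the link between the length of x and the first dimension of data would be invisible; with shared symbols, a mismatch like the second case is detectable.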

Step 5 — Validate¶

Convert to NXDL XML and inspect:

nyaml2nxdl NXdouble_slit.yaml --output-file NXdouble_slit.nxdl.xml

As discussed earlier, the application definition is already part of pynxtools, so you can generate the template directly:

dataconverter generate-template --nxdl NXdouble_slit

You should now see dictionary-style output that includes, for example, /ENTRY[entry]/title and /ENTRY[entry]/start_time. This is the template for NXdouble_slit that we will fill in during the next step.

Concept vs. instance paths

The template shows ENTRY (upper case) — the concept — and [entry] (lower case in brackets) — the default instance name. The concept is the schema; the instance is what gets written to the file.
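
The relationship between the two can be illustrated by splitting a template key into its concept path and its instance path. The helper `split_template_path` below is a hypothetical sketch based only on the path notation shown in this guide; it is not a pynxtools function.

```python
import re

# One path segment: either "CONCEPT[instance]" or a plain fixed name.
SEGMENT = re.compile(r"^(?P<concept>[^\[\]]+)(?:\[(?P<instance>[^\[\]]+)\])?$")


def split_template_path(path):
    """Return (concept_path, instance_path) for a template key."""
    concepts, instances = [], []
    for segment in path.strip("/").split("/"):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"Unexpected path segment: {segment!r}")
        concept = match.group("concept")
        concepts.append(concept)
        # Segments without brackets use the same name in schema and file.
        instances.append(match.group("instance") or concept)
    return "/" + "/".join(concepts), "/" + "/".join(instances)


concept_path, instance_path = split_template_path(
    "/ENTRY[entry]/INSTRUMENT[instrument]/detector/distance"
)
print(concept_path)   # /ENTRY/INSTRUMENT/detector/distance
print(instance_path)  # /entry/instrument/detector/distance
```

Changing the bracketed part (for example ENTRY[scan_1]) changes only the instance path; the concept path, and therefore the schema element being filled, stays the same.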

Look for these paths in the output:

/ENTRY[entry]/interference_pattern/x
/ENTRY[entry]/interference_pattern/y
/ENTRY[entry]/processID[processID]/description (optional)
/ENTRY[entry]/INSTRUMENT[instrument]/detector/distance

You can also pass the --required flag to only see those paths that are required to be filled:

dataconverter generate-template --nxdl NXdouble_slit --required
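
Conceptually, the flag filters paths by their exists level. The following Python sketch mimics that on a hand-written subset of the definition; the dictionary `EXISTS` and the helper `paths_with_level` are hypothetical, the path spellings are modelled on the ones shown above and may differ in the real template, and fields without an explicit exists are treated as required.

```python
# Illustrative subset of the definition: template path -> `exists` level.
# Fields without an explicit `exists` are treated as required here.
EXISTS = {
    "/ENTRY[entry]/title": "required",
    "/ENTRY[entry]/INSTRUMENT[instrument]/source/wavelength": "required",
    "/ENTRY[entry]/INSTRUMENT[instrument]/source/coherence_length": "recommended",
    "/ENTRY[entry]/INSTRUMENT[instrument]/source/type": "optional",
}


def paths_with_level(exists, level):
    """Return the sorted paths whose requirement level equals `level`."""
    return sorted(path for path, value in exists.items() if value == level)


# Roughly what a "required only" view of the template keeps.
for path in paths_with_level(EXISTS, "required"):
    print(path)
```

Recommended and optional paths such as coherence_length and type drop out of this view, which makes it a convenient checklist of the minimum a valid file must contain.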

Note that the complete version of NXdouble_slit uses even more concepts to illustrate the possibilities the NeXus data model provides. Have a look at the complete NXdouble_slit.yaml. Which additional ideas can you detect?

Complete NXdouble_slit.yaml

category: application
doc: |
  Application definition for a double-slit interference experiment.
  Records the light source, aperture geometry, detector layout, and the
  measured 2D interference pattern needed to determine fringe spacing and
  source coherence length.

  See https://en.wikipedia.org/wiki/Double-slit_experiment.
symbols:
  doc: |
    Dimension symbols used in this definition.
  n_x: |
    Number of detector pixels along x.
  n_y: |
    Number of detector pixels along y.
type: group
NXdouble_slit(NXobject):
  (NXentry):
    definition:
      enumeration: [NXdouble_slit]
    title:
    start_time(NX_DATE_TIME):
      doc: |
        ISO 8601 datetime of the measurement start.
    end_time(NX_DATE_TIME):
      exists: recommended
    (NXinstrument):
      source(NXsource):
        wavelength(NX_FLOAT):
          unit: NX_WAVELENGTH
          doc: |
            Central wavelength of the light source.
        coherence_length(NX_FLOAT):
          unit: NX_LENGTH
          exists: recommended
          doc: |
            Temporal coherence length of the source.
        type(NX_CHAR):
          exists: optional
          enumeration: [Laser, Filtered lamp, LED]
      double_slit(NXslit):
        x_gap(NX_FLOAT):
          unit: NX_LENGTH
          doc: |
            Width of each individual slit.
        slit_separation(NX_FLOAT):
          unit: NX_LENGTH
          doc: |
            Center-to-center distance between the two slits.
        height(NX_FLOAT):
          exists: optional
          unit: NX_LENGTH
          doc: |
            Height of the two slits.
      detector(NXdetector):
        distance(NX_FLOAT):
          unit: NX_LENGTH
          doc: |
            Distance from the slit plane to the detector surface.
    processID(NXprocess):
      nameType: partial
      exists: optional
      doc: |
        Describes one step in the processing chain that converts raw detector
        pixel data to the calibrated interference pattern stored in
        ``interference_pattern``. The 'ID' suffix in the group name is replaced
        by a short identifier chosen by the writer, e.g. 'pixel_calibration'
        or 'background_correction'. Multiple NXprocess groups may be present;
        their order is given by sequence_index.
      sequence_index(NX_POSINT):
        doc: |
          Sequence index of processing, for determining the order of multiple
          NXprocess steps. Starts with 1.
      description(NX_CHAR):
        doc: |
          Free-text description of what this processing step does.
      program(NX_CHAR):
        exists: optional
        doc: |
          Name of the software used for this step.
      version(NX_CHAR):
        exists: optional
        doc: |
          Version string of the software.
      date(NX_DATE_TIME):
        exists: optional
    interference_pattern(NXdata):
      doc: |
        Default plot: the calibrated 2D interference pattern with spatial
        axes. The signal data may be identical to the raw detector array or
        derived from it via one or more NXprocess steps.
      \@signal:
        enumeration: [data]
      \@axes:
        enumeration: [['x', 'y']]
      data(NX_NUMBER):
        unit: NX_ANY
        doc: |
          2D interference intensity after any processing steps.
        dimensions:
          rank: 2
          dim: (n_x, n_y)
      x(NX_FLOAT):
        unit: NX_LENGTH
        doc: |
          Horizontal spatial offset from the detector centre, derived from
          pixel index and pixel pitch.
        dimensions:
          rank: 1
          dim: (n_x,)
      y(NX_FLOAT):
        unit: NX_LENGTH
        doc: |
          Vertical spatial offset from the detector centre, derived from
          pixel index and pixel pitch.
        dimensions:
          rank: 1
          dim: (n_y,)

Advanced: specialize a base class (bonus)¶

If your source is always a laser, you can create a dedicated NXlaser base class rather than repeating the specialization in every application definition:

# NXlaser.yaml
category: base
doc: A specialization of NXsource for coherent laser sources.
type: group
NXlaser(NXsource):
  wavelength(NX_FLOAT):
    doc: Central wavelength of the laser line.
  coherence_length(NX_FLOAT):
    unit: NX_LENGTH
    exists: recommended
  type(NX_CHAR):
    enumeration: [Laser]

Then in NXdouble_slit, replace source(NXsource) with source(NXlaser) and only list which fields are required or recommended — everything defined in NXlaser is inherited automatically.

NXdouble_slit(NXobject):
  (NXentry):
    (NXinstrument):
      source(NXlaser):
        wavelength(NX_FLOAT):
        coherence_length(NX_FLOAT):
          exists: recommended

Info

This is just an exercise for you to understand NeXus better. We will not use NXlaser in the upcoming examples. Find a complete NXlaser.nxdl.xml in the pynxtools examples.

Appendix: How to add your definition to pynxtools¶

As we said above, NXdouble_slit.nxdl.xml is already part of pynxtools, so you don't need to add it for the following steps. However, if you create another application definition, say NXmytechnique, there are two ways to add it.

Local development (fastest):

In order to use your application definitions and base classes directly, you will need to add them to the NeXus definitions stored in pynxtools. For this, you need to install pynxtools in editable mode. You can learn more in the pynxtools development guide.

Install pynxtools with the -e option in the same virtual environment that you are already working in, then initialize the definitions submodule.

Then you can place your application definition NXDL XML file in pynxtools:

cp NXmytechnique.nxdl.xml src/pynxtools/definitions/contributed_definitions/
dataconverter generate-template --nxdl NXmytechnique

Community contribution (permanent):

The more permanent way is to add the new application definition (or base class) to the FAIRmat NeXus definitions repository. That ensures that others can use it and that it can eventually be brought to standardization with the NeXus International Advisory Committee (NIAC):

Fork FAIRmat-NFDI/nexus_definitions
Add your NXDL file to contributed_definitions/
Open a pull request
Once merged, update the pynxtools submodule: ./scripts/definitions.sh update