Ophys Metadata Structure

This document describes the architecture of the optical physiology (ophys) metadata system in NeuroConv. It is intended for developers who are contributing new interfaces or modifying existing ones.

For user-facing documentation on how to annotate ophys data, see How to Annotate Optical Physiology Data.

Design Principles

The ophys metadata system is built on several core principles:

  1. Dictionary-Based Organization: Metadata is organized using dictionaries with meaningful keys. Accessing components by name is clearer and less error-prone than positional access, and it makes metadata easier to reference, organize, and extend.

    metadata["Ophys"]["ImagingPlanes"]["visual_cortex"]["indicator"] = "GCaMP6s"
    
  2. Consistent metadata_key Across Interfaces: Every ophys interface takes a single metadata_key parameter that, by default, is used as the key for all of its components (Device, ImagingPlane, MicroscopySeries). Using the same pattern in every interface makes the API predictable and easier to learn.

  3. Explicit References: Components reference each other using explicit _metadata_key fields. This makes relationships between components clear and enables validation.

  4. Top-Level Devices: Devices are stored at the top level (metadata["Devices"]), which enables device sharing across ophys, ecephys, and other modalities.

  5. Provenance-First get_metadata(): The get_metadata() method returns only values extracted from the source data, not defaults. Defaults are applied later, when the NWB objects are created.
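The provenance-first principle can be illustrated with a minimal sketch. The helper apply_defaults and the default values are hypothetical and are not part of NeuroConv's API; they only show the idea that extracted values take precedence and defaults fill the gaps at object-creation time.

```python
from copy import deepcopy

# Hypothetical defaults for illustration; the real defaults live inside
# NeuroConv and may differ.
DEFAULT_IMAGING_PLANE = {
    "name": "ImagingPlane",
    "description": "The plane or volume being imaged by the microscope.",
    "indicator": "unknown",
    "location": "unknown",
}


def apply_defaults(extracted: dict, defaults: dict) -> dict:
    """Return a new dict in which extracted values override defaults."""
    resolved = deepcopy(defaults)
    resolved.update(deepcopy(extracted))
    return resolved


# What get_metadata() would return: only values found in the source data.
extracted_plane = {"indicator": "GCaMP6s", "excitation_lambda": 920.0}

# What would be used when the NWB object is actually created.
resolved_plane = apply_defaults(extracted_plane, DEFAULT_IMAGING_PLANE)
```

Because the extracted dictionary is never mutated, it remains a faithful record of what came from the source data.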

Metadata Structure Overview

The complete ophys metadata structure:

metadata = {
    "NWBFile": {...},  # Session-level metadata
    "Subject": {...},  # Subject information

    "Devices": {
        "visual_cortex": {
            "name": "Microscope",
            "description": "Two-photon microscope for visual cortex imaging",
            "manufacturer": "Bruker"
        },
        "hippocampus": {
            "name": "Miniscope",
            "description": "UCLA Miniscope v4 for hippocampal imaging"
        }
    },

    "Ophys": {
        "ImagingPlanes": {
            "visual_cortex": {
                "name": "ImagingPlaneVisualCortex",
                "description": "Imaging plane in V1 layer 2/3",
                "device_metadata_key": "visual_cortex",  # Reference to device
                "excitation_lambda": 920.0,
                "indicator": "GCaMP6s",
                "location": "V1 binocular zone",
                "optical_channel": [
                    {
                        "name": "GreenChannel",
                        "description": "GCaMP emission channel",
                        "emission_lambda": 510.0
                    }
                ]
            },
            "hippocampus": {
                "name": "ImagingPlaneHippocampus",
                "device_metadata_key": "hippocampus",
                "excitation_lambda": 470.0,
                "indicator": "GCaMP6f",
                "location": "CA1 pyramidal layer",
                "optical_channel": [...]
            }
        },

        "MicroscopySeries": {
            "visual_cortex": {
                "name": "TwoPhotonSeriesVisualCortex",
                "description": "Two-photon calcium imaging",
                "imaging_plane_metadata_key": "visual_cortex",  # Reference to imaging plane
                "unit": "n.a.",
                "dimension": [512, 512]
            },
            "hippocampus": {
                "name": "OnePhotonSeriesHippocampus",
                "description": "Miniscope calcium imaging",
                "imaging_plane_metadata_key": "hippocampus",
                "unit": "n.a.",
                "dimension": [480, 752]
            }
        },

        "PlaneSegmentations": {
            "suite2p_analysis": {
                "name": "PlaneSegmentation",
                "description": "ROIs detected by Suite2p",
                "imaging_plane_metadata_key": "visual_cortex"
            }
        },

        "RoiResponses": {
            "suite2p_analysis": {
                "raw": {
                    "name": "RoiResponseSeries",
                    "description": "Raw fluorescence traces",
                    "unit": "n.a."
                },
                "neuropil": {
                    "name": "Neuropil",
                    "description": "Neuropil fluorescence",
                    "unit": "n.a."
                },
                "deconvolved": {
                    "name": "Deconvolved",
                    "description": "Deconvolved activity",
                    "unit": "n.a."
                },
                "denoised": {
                    "name": "Denoised",
                    "description": "Denoised activity",
                    "unit": "n.a."
                },
                "baseline": {
                    "name": "Baseline",
                    "description": "Baseline fluorescence",
                    "unit": "n.a."
                },
                "dff": {
                    "name": "DfOverF",
                    "description": "Delta F over F",
                    "unit": "n.a."
                }
            }
        },

        "SegmentationImages": {
            "name": "SegmentationImages",
            "description": "Summary images from segmentation",
            "suite2p_analysis": {
                "correlation": {
                    "name": "correlation_image",
                    "description": "Correlation image from Suite2p"
                },
                "mean": {
                    "name": "mean_image",
                    "description": "Mean image from Suite2p"
                }
            }
        }
    }
}
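Since every link in this structure is a plain string key, the references can be checked before writing. The following is a minimal sketch of such a check; find_dangling_references is a hypothetical helper, not a NeuroConv function, and it assumes only the key names shown in the structure above.

```python
def find_dangling_references(metadata: dict) -> list[str]:
    """Return one message per _metadata_key field that points to a missing entry."""
    problems = []
    devices = metadata.get("Devices", {})
    ophys = metadata.get("Ophys", {})
    imaging_planes = ophys.get("ImagingPlanes", {})

    # ImagingPlane -> Device links
    for key, plane in imaging_planes.items():
        device_key = plane.get("device_metadata_key")
        if device_key is not None and device_key not in devices:
            problems.append(f"ImagingPlanes[{key!r}] -> Devices[{device_key!r}] not found")

    # MicroscopySeries / PlaneSegmentation -> ImagingPlane links
    for group in ("MicroscopySeries", "PlaneSegmentations"):
        for key, entry in ophys.get(group, {}).items():
            plane_key = entry.get("imaging_plane_metadata_key")
            if plane_key is not None and plane_key not in imaging_planes:
                problems.append(f"{group}[{key!r}] -> ImagingPlanes[{plane_key!r}] not found")
    return problems


example = {
    "Devices": {"visual_cortex": {"name": "Microscope"}},
    "Ophys": {
        "ImagingPlanes": {
            "visual_cortex": {"name": "ImagingPlaneVisualCortex", "device_metadata_key": "visual_cortex"},
        },
        "MicroscopySeries": {
            "visual_cortex": {"name": "TwoPhotonSeriesVisualCortex", "imaging_plane_metadata_key": "visual_cortex"},
        },
        "PlaneSegmentations": {
            # Deliberate typo in the reference to show what the check reports.
            "suite2p_analysis": {"name": "PlaneSegmentation", "imaging_plane_metadata_key": "v1_typo"},
        },
    },
}

problems = find_dangling_references(example)
```

Missing references are not reported as problems here because, as described later in this document, a missing required link is filled with a default object at write time.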

The metadata_key Parameter

All imaging and segmentation interfaces accept a metadata_key parameter. This parameter is keyword-only to ensure explicit usage.

from typing import Optional

from neuroconv import BaseDataInterface


class SomeOphysInterface(BaseDataInterface):
    def __init__(
        self,
        *,  # Force keyword-only
        verbose: bool = False,
        metadata_key: Optional[str] = None,
        **source_data,
    ):
        self.metadata_key = metadata_key
        ...

The argument name metadata_key is the same across all interfaces, which ensures a common API. When it is None (the default), the interface automatically generates a unique key from the parameters that make the interface unique (e.g. stream name, channel name). A user who passes an explicit value takes responsibility for uniqueness and can use the value to intentionally share or customize metadata keys.

The default is None rather than a hardcoded string (e.g. "caiman_segmentation") for consistency across interfaces. Multi-stream and multi-channel interfaces (like ScanImageImagingInterface) cannot have a fixed default because the key must include runtime parameters such as channel_name and plane_index. Using None as the sentinel and resolving the default in __init__ lets every interface share the same pattern: simple interfaces pick a static string, and parametric interfaces build the key from their constructor arguments.
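A minimal sketch of this pattern follows. Both classes are hypothetical stand-ins rather than real NeuroConv interfaces, and the default strings and the key format are assumptions chosen for illustration.

```python
from typing import Optional


class StaticKeyInterface:
    """Hypothetical simple interface with one fixed default key."""

    def __init__(self, *, metadata_key: Optional[str] = None):
        # None is the sentinel; the static default is resolved here.
        self.metadata_key = metadata_key if metadata_key is not None else "example_segmentation"


class ParametricKeyInterface:
    """Hypothetical multi-channel, multi-plane interface."""

    def __init__(
        self,
        *,
        channel_name: str,
        plane_index: int = 0,
        metadata_key: Optional[str] = None,
    ):
        if metadata_key is None:
            # Build the default from the arguments that make this instance unique.
            # The exact format is an assumption for this sketch.
            metadata_key = f"{channel_name}_plane{plane_index}"
        self.metadata_key = metadata_key


green = ParametricKeyInterface(channel_name="Channel1", plane_index=0)
red = ParametricKeyInterface(channel_name="Channel2", plane_index=0)
custom = ParametricKeyInterface(channel_name="Channel1", metadata_key="visual_cortex")
```

Both classes expose the same signature for metadata_key, so callers do not need to know whether the default is static or derived.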

Key Propagation

The metadata_key parameter determines the entry point for the interface’s primary object(s). For imaging interfaces, this is the MicroscopySeries; for segmentation interfaces, this is the PlaneSegmentation and RoiResponses. Linked objects (ImagingPlane, Device) are resolved through their own _metadata_key references, which may point to different entries.

For an imaging interface with metadata_key="visual_cortex":

  • metadata["Ophys"]["MicroscopySeries"]["visual_cortex"] - The primary object (direct lookup via metadata_key)

  • metadata["Ophys"]["ImagingPlanes"][imaging_plane_metadata_key] - Resolved via imaging_plane_metadata_key inside the MicroscopySeries entry

  • metadata["Devices"][device_metadata_key] - Resolved via device_metadata_key inside the ImagingPlane entry

For a segmentation interface with metadata_key="suite2p_analysis":

  • metadata["Ophys"]["PlaneSegmentations"]["suite2p_analysis"] - Direct lookup via metadata_key

  • metadata["Ophys"]["RoiResponses"]["suite2p_analysis"] - Direct lookup via the same metadata_key

  • metadata["Ophys"]["ImagingPlanes"][imaging_plane_metadata_key] - Resolved via imaging_plane_metadata_key inside the PlaneSegmentation entry

  • metadata["Devices"][device_metadata_key] - Resolved via device_metadata_key inside the ImagingPlane entry

  • metadata["Ophys"]["SegmentationImages"]["suite2p_analysis"] - Summary images

In the simplest case, all these keys happen to be the same value (the interface’s metadata_key), which is what get_metadata() produces by default. But the indirection through _metadata_key fields allows different components to reference shared resources. For example, two segmentation pipelines can point their imaging_plane_metadata_key to the same ImagingPlane entry, and two imaging planes can point their device_metadata_key to the same Device entry.
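The lookup chain for an imaging interface can be sketched as a plain dictionary traversal. The function resolve_imaging_chain is a hypothetical helper written for this illustration; it assumes that all references are present and does not handle missing links.

```python
def resolve_imaging_chain(metadata: dict, metadata_key: str) -> tuple[dict, dict, dict]:
    """Follow MicroscopySeries -> ImagingPlane -> Device for one interface key."""
    ophys = metadata["Ophys"]
    # Direct lookup via the interface's metadata_key.
    series = ophys["MicroscopySeries"][metadata_key]
    # Indirect lookups via the _metadata_key fields.
    plane = ophys["ImagingPlanes"][series["imaging_plane_metadata_key"]]
    device = metadata["Devices"][plane["device_metadata_key"]]
    return series, plane, device


metadata = {
    "Devices": {"shared_microscope": {"name": "Microscope"}},
    "Ophys": {
        "ImagingPlanes": {
            "plane_area1": {"name": "ImagingPlaneArea1", "device_metadata_key": "shared_microscope"},
        },
        "MicroscopySeries": {
            "visual_cortex": {
                "name": "TwoPhotonSeriesVisualCortex",
                "imaging_plane_metadata_key": "plane_area1",
            },
        },
    },
}

series, plane, device = resolve_imaging_chain(metadata, "visual_cortex")
```

Here the three keys differ ("visual_cortex", "plane_area1", "shared_microscope"), which shows that only the first lookup depends on the interface's metadata_key.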

Single ImageSegmentation Container

While having multiple PlaneSegmentations makes sense (different segmentation algorithms like Suite2p vs CaImAn, or multiple runs of the same algorithm), there is no clear use case for multiple ImageSegmentation containers in an NWB file.

PyNWB and the NWB schema allow multiple ImageSegmentation containers, but NeuroConv does not support this. Instead, NeuroConv uses a single, non-editable ImageSegmentation container where all PlaneSegmentations are stored. This is handled internally and users cannot configure the ImageSegmentation container. Users work directly with PlaneSegmentations in metadata, and NeuroConv places them in the single ImageSegmentation container when creating the NWB file.

This simplifies both the metadata specification (no need to manage container names) and the organization of the resulting NWB file.

Unified MicroscopySeries

Metadata uses a unified MicroscopySeries key for all imaging data, regardless of whether it will be written as TwoPhotonSeries or OnePhotonSeries in the NWB file.

The choice of NWB neurodata type (TwoPhotonSeries vs OnePhotonSeries) is specified as a conversion option, not in metadata. This follows the provenance principle: metadata describes the data, while conversion options determine how to write it to NWB.

For format-specific interfaces (e.g., ScanImageImagingInterface), the series type is extracted from the source data. For generic interfaces (e.g., TiffImagingInterface), users must specify the series type at conversion time:

# Format-specific interface - series type extracted from source
interface = ScanImageImagingInterface(file_path="data.tif", metadata_key="visual_cortex")
interface.add_to_nwbfile(nwbfile, metadata)  # Uses extracted type (TwoPhotonSeries)

# Generic interface - series type must be specified
interface = TiffImagingInterface(file_path="data.tif", metadata_key="visual_cortex")
interface.add_to_nwbfile(nwbfile, metadata, photon_series_type="TwoPhotonSeries")

# Override is always possible
interface.add_to_nwbfile(nwbfile, metadata, photon_series_type="OnePhotonSeries")
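The precedence between the extracted type and the conversion option can be summarized in a few lines. This is a sketch of the rule as described above, not NeuroConv's implementation; resolve_photon_series_type and its argument names are hypothetical.

```python
from typing import Optional

VALID_SERIES_TYPES = ("TwoPhotonSeries", "OnePhotonSeries")


def resolve_photon_series_type(
    extracted_type: Optional[str],
    photon_series_type: Optional[str] = None,
) -> str:
    """Pick the series type: an explicit conversion option wins over the extracted type."""
    chosen = photon_series_type if photon_series_type is not None else extracted_type
    if chosen is None:
        # Generic interface with nothing extracted and nothing specified.
        raise ValueError("photon_series_type must be specified for this interface")
    if chosen not in VALID_SERIES_TYPES:
        raise ValueError(f"Unknown series type {chosen!r}; expected one of {VALID_SERIES_TYPES}")
    return chosen


# Format-specific interface: the type comes from the source data.
from_source = resolve_photon_series_type("TwoPhotonSeries")
# Generic interface: the type comes from the conversion option.
from_option = resolve_photon_series_type(None, "TwoPhotonSeries")
# Override: the conversion option takes precedence.
overridden = resolve_photon_series_type("TwoPhotonSeries", "OnePhotonSeries")
```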

Unified RoiResponses

All ROI trace types (raw fluorescence, neuropil, deconvolved, denoised, baseline, df/f) are stored under a single RoiResponses key in metadata. This consolidates what NWB core splits into separate Fluorescence and DfOverF containers.

At write time, all traces are written as RoiResponseSeries inside a single Fluorescence container, without splitting into Fluorescence and DfOverF. This follows the direction of nwb-schema#616 and matches ndx-microscopy’s single-container pattern (MicroscopyResponseSeriesContainer).
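As a rough sketch of what this means for the writer, every entry under one RoiResponses key maps to one series in the same container, with no special case for df/f. The function plan_roi_response_series is hypothetical and only builds a description of what would be written; it does not create NWB objects.

```python
def plan_roi_response_series(metadata: dict, metadata_key: str) -> dict:
    """Describe the single container and the series that would be written into it."""
    roi_responses = metadata["Ophys"]["RoiResponses"][metadata_key]
    # Every trace type, including dff, lands in the same container.
    series_names = [entry["name"] for entry in roi_responses.values()]
    return {"container": "Fluorescence", "series": series_names}


metadata = {
    "Ophys": {
        "RoiResponses": {
            "suite2p_analysis": {
                "raw": {"name": "RoiResponseSeries", "unit": "n.a."},
                "neuropil": {"name": "Neuropil", "unit": "n.a."},
                "dff": {"name": "DfOverF", "unit": "n.a."},
            }
        }
    }
}

plan = plan_roi_response_series(metadata, "suite2p_analysis")
```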

Alignment with ndx-microscopy

The metadata structure is designed to align with the ndx-microscopy extension, which represents the future direction of optical physiology in NWB.

ndx-microscopy uses:

  • MicroscopySeries for all imaging data (instead of separate TwoPhotonSeries/OnePhotonSeries)

  • MicroscopyResponseSeries for all ROI traces (instead of separate Fluorescence/DfOverF)

By adopting similar patterns (MicroscopySeries, RoiResponses), NeuroConv’s metadata structure will require minimal changes when ndx-microscopy is integrated into NWB core. This makes the eventual transition smoother for users.

Linking and Object Creation

Each interface’s goal is to create its primary object(s) in NWB. For example, an imaging interface creates an imaging series (a TwoPhotonSeries or OnePhotonSeries, described under MicroscopySeries in metadata). The metadata specifies the attributes of that object (name, description, unit, etc.) and also its linked objects: an ImagingPlane for the series and, in turn, a Device for the ImagingPlane.

Contained vs Linked Components

In NWB, some components are fully contained within their parent while others exist as separate, linked objects. This distinction affects how they are represented in metadata:

Contained components like optical_channel are fully specified as nested metadata inside their parent. An ImagingPlane’s optical channels are defined directly within the ImagingPlane metadata dictionary because they exist only within that ImagingPlane.

Linked components like Device and ImagingPlane are separate NWB objects that can be shared or referenced by multiple other objects. For example, an ImagingPlane must reference the Device (microscope) that was used to acquire the data, and a TwoPhotonSeries must reference the ImagingPlane where the imaging occurred.

How Linking Works

In the metadata dict, we don’t have actual NWB objects yet, only dictionaries describing them. To express relationships between linked components, we use special _metadata_key fields that contain the key of the referenced component.

device_metadata_key is used in ImagingPlane to reference its Device:

imaging_plane = {
    "name": "ImagingPlane",
    "device_metadata_key": "visual_cortex",  # Points to metadata["Devices"]["visual_cortex"]
    ...
}

imaging_plane_metadata_key is used in MicroscopySeries and PlaneSegmentation entries to reference their ImagingPlane:

photon_series = {
    "name": "TwoPhotonSeries",
    "imaging_plane_metadata_key": "visual_cortex",  # Points to ImagingPlanes["visual_cortex"]
    ...
}

plane_segmentation = {
    "name": "PlaneSegmentation",
    "imaging_plane_metadata_key": "visual_cortex",
    ...
}

This allows multiple components (e.g., multiple segmentation pipelines) to reference the same ImagingPlane, as shown in the how-to guide for annotating multiple segmentations of the same data.
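To see which entries share an ImagingPlane, the references can be inverted into a map from plane key to the entries that point at it. The helper group_by_imaging_plane and the "caiman_analysis" entry are hypothetical and used only for illustration.

```python
from collections import defaultdict


def group_by_imaging_plane(ophys_metadata: dict) -> dict[str, list[str]]:
    """Map each imaging plane key to the entries that reference it."""
    referrers = defaultdict(list)
    for group in ("MicroscopySeries", "PlaneSegmentations"):
        for key, entry in ophys_metadata.get(group, {}).items():
            plane_key = entry.get("imaging_plane_metadata_key")
            if plane_key is not None:
                referrers[plane_key].append(f"{group}/{key}")
    return dict(referrers)


ophys_metadata = {
    "ImagingPlanes": {"visual_cortex": {"name": "ImagingPlaneVisualCortex"}},
    "PlaneSegmentations": {
        # Two segmentation pipelines run on the same imaging data.
        "suite2p_analysis": {"name": "PlaneSegmentationSuite2p", "imaging_plane_metadata_key": "visual_cortex"},
        "caiman_analysis": {"name": "PlaneSegmentationCaiman", "imaging_plane_metadata_key": "visual_cortex"},
    },
}

shared = group_by_imaging_plane(ophys_metadata)
```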

When Objects Are Created

Linked objects (Devices, ImagingPlanes, etc.) are not created when the metadata dict is assembled. They are created when add_to_nwbfile is called. The metadata dict defines what could be created, and the _metadata_key references determine what actually gets written to the NWB file. At that point, the string references are resolved to actual NWB objects.

The rules are:

  1. Only entries that are actually referenced by other objects (via _metadata_key fields) are created. Entries that exist in the metadata dict but are not referenced by anything will not be written to the file. This means you can define all the devices of a conversion in a shared YAML and only the ones that are actually linked will end up in the NWB file.

  2. If an object requires a linked object but the link is missing (e.g. an ImagingPlane requires a Device but has no device_metadata_key), a default object is created and linked automatically at write time.

  3. For shared resources (e.g. two imaging planes using the same microscope), the user or the converter sets the _metadata_key references explicitly. The object is created by whichever interface writes first, and subsequent interfaces reuse the existing object.

# Two imaging planes sharing one device
metadata["Devices"]["shared_microscope"] = {
    "name": "Microscope",
    "description": "Two-photon microscope used for both planes",
    "manufacturer": "Thorlabs",
}

metadata["Ophys"]["ImagingPlanes"]["plane_area1"] = {
    "name": "ImagingPlaneArea1",
    "location": "V1",
    "device_metadata_key": "shared_microscope",
}

metadata["Ophys"]["ImagingPlanes"]["plane_area2"] = {
    "name": "ImagingPlaneArea2",
    "location": "V2",
    "device_metadata_key": "shared_microscope",
}

Note that device keys (and imaging plane keys) are independent of any interface’s metadata_key. They can be any arbitrary string, as shown by "shared_microscope" above, which does not correspond to any interface’s key. No interface “owns” the device; it is created at write time by whichever interface first follows the reference chain to it.
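The three creation rules can be simulated without any NWB objects. In this sketch a plain dictionary named written stands in for the NWB file, and write_imaging_plane, the "default_device" key, and the default device name are all hypothetical; the real write path in NeuroConv may be organized differently.

```python
def write_imaging_plane(written: dict, metadata: dict, plane_key: str) -> None:
    """Simulate writing one imaging plane into `written`, a stand-in for an NWB file."""
    plane = metadata["Ophys"]["ImagingPlanes"][plane_key]
    device_key = plane.get("device_metadata_key")

    if device_key is None:
        # Rule 2: the required link is missing, so fall back to a default device.
        device_key = "default_device"
        device_metadata = {"name": "DefaultDevice"}
    else:
        device_metadata = metadata["Devices"][device_key]

    # Rule 3: the first writer creates the device; later writers reuse it.
    device = written["devices"].setdefault(device_key, dict(device_metadata))
    written["imaging_planes"][plane_key] = {"name": plane["name"], "device": device}


metadata = {
    "Devices": {
        "shared_microscope": {"name": "Microscope"},
        "unused_microscope": {"name": "SpareMicroscope"},  # never referenced
    },
    "Ophys": {
        "ImagingPlanes": {
            "plane_area1": {"name": "ImagingPlaneArea1", "device_metadata_key": "shared_microscope"},
            "plane_area2": {"name": "ImagingPlaneArea2", "device_metadata_key": "shared_microscope"},
            "plane_unlinked": {"name": "ImagingPlaneUnlinked"},
        }
    },
}

written = {"devices": {}, "imaging_planes": {}}
for plane_key in ("plane_area1", "plane_area2", "plane_unlinked"):
    write_imaging_plane(written, metadata, plane_key)
```

After the loop, "unused_microscope" is absent from written (rule 1), the unlinked plane has a default device (rule 2), and both area planes hold the same device object (rule 3).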

Because only referenced entries are written to the NWB file, the metadata dict can hold all possible components (e.g. in a shared YAML) and the _metadata_key links control which ones are actually used for each conversion. This enables a workflow where a single YAML file contains the full metadata for a project (all devices, imaging planes, etc.) and is shared across sessions in a multi-session conversion script. For each session, the conversion code sets the _metadata_key references programmatically to select which components to write and how to link them. For example, different sessions might link their imaging planes to different devices, or different segmentation runs might reference different imaging planes, all from the same shared metadata file.
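A minimal sketch of this per-session workflow follows. The shared metadata is written inline as a dictionary to keep the example self-contained; in practice it would be loaded from the project's YAML file. The device keys, the SESSION_RIGS mapping, and metadata_for_session are hypothetical names.

```python
from copy import deepcopy

# Stand-in for the contents of a shared project YAML file.
SHARED_METADATA = {
    "Devices": {
        "rig_a": {"name": "MicroscopeRigA"},
        "rig_b": {"name": "MicroscopeRigB"},
    },
    "Ophys": {
        "ImagingPlanes": {
            # No device link yet; it is set per session.
            "visual_cortex": {"name": "ImagingPlaneVisualCortex", "location": "V1"},
        }
    },
}

# Hypothetical record of which rig was used in each session.
SESSION_RIGS = {"session_01": "rig_a", "session_02": "rig_b"}


def metadata_for_session(shared: dict, session_id: str) -> dict:
    """Copy the shared metadata and link the imaging plane to this session's device."""
    metadata = deepcopy(shared)  # keep the shared metadata untouched
    plane = metadata["Ophys"]["ImagingPlanes"]["visual_cortex"]
    plane["device_metadata_key"] = SESSION_RIGS[session_id]
    return metadata


per_session = {
    session_id: metadata_for_session(SHARED_METADATA, session_id)
    for session_id in SESSION_RIGS
}
```

Both devices stay in every session's metadata, but only the one referenced by the imaging plane would be written for that session.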