471 changes: 471 additions & 0 deletions components/omega/doc/design/DynamicInputStreams.md

Large diffs are not rendered by default.

135 changes: 135 additions & 0 deletions components/omega/doc/devGuide/DynamicInputStreams.md
@@ -0,0 +1,135 @@
(omega-dev-dynamicinputstreams)=

## Dynamic Input Streams

Dynamic input streams extend the [IOStream](#omega-dev-iostreams) mechanism to
support reading fields whose names and secondary dimensions are not known at
compile time. When a stream is configured with `DynamicFields: true` in the
input YAML, fields in that stream do not need to be pre-registered in the field
registry. Instead, each field's dimensions and type are discovered from the
open file at read time, the necessary `Dimension` and `Field` objects are created
automatically, storage is allocated, and the data is read.

Any module requiring this functionality must include the `IOStream.h` header
file. The new `IOStream::readAllDynamic()` call described below replaces
any per-stream `IOStream::read()` calls that would otherwise require the
calling code to know stream names.

### The `DynamicFields` flag

The flag is stored as a private `bool DynamicFields` member on each `IOStream`
instance and defaults to `false`. It is set in `IOStream::create()` when the
stream's YAML block contains `DynamicFields: true`; if the key is absent the
stream behaves as a standard IOStream.

Dynamic streams skip the normal field-existence check in `IOStream::validate()`
because the fields will not exist until the stream is read. The stream is
marked validated immediately so that `IOStream::validateAll()` does not abort.

### Reading a dynamic stream

When `IOStream::readStream()` processes a field from a dynamic stream, it calls
the private method `registerAndReadDynamicField()` instead of the standard
`Field::get()` + `readFieldData()` path:

```c++
Error DynErr = registerAndReadDynamicField(InFileID, FieldName);
CHECK_ERROR_ABORT(DynErr,
                  "IOStream::readStream: Failed to register/read dynamic field {} "
                  "in stream {}", FieldName, Name);
```

Any error from `registerAndReadDynamicField` causes an immediate abort with a
descriptive message.

`registerAndReadDynamicField()` performs the full registration pipeline for one
field:

1. Call `IO::getVarInfo()` to retrieve the field's dimension names, global
dimension lengths, and native data type from the open file.
2. Classify each dimension. A dimension is a mesh dimension if
`Dimension::exists(name) && Dimension::isDistributedDim(name)` is true;
otherwise it is a secondary dimension.
3. Require exactly one mesh dimension and at most one secondary dimension;
return an error otherwise.
4. If a secondary dimension is present: look it up in `Dimension::AllDims`.
If absent, create it with `Dimension::create(name, length)`. If it already
exists with the same length, reuse it silently. If it exists with a
different length, return an error.
5. Return an error if `Field::exists(fieldName)` is true (name collision).
6. Create the field with `Field::create()` using `TimeDependent=false`.
Because the first dimension is a distributed mesh dimension,
`Field::isDistributed()` will automatically return `true`.
7. Allocate a `HostArray1DR8` or `HostArray2DR8` and attach it to the field
via `Field::attachFieldData()`.
8. Obtain a SCORPIO decomposition for the read. For 1-D mesh fields a
temporary decomposition is built from the mesh dimension's global offsets
and destroyed after use. For 2-D fields `getOrCreateDynamicDecomp()` is
called to obtain a cached decomposition.
9. Read the data via `IO::readArray()`. SCORPIO converts from the native file
type to `R8` during the read.
10. Copy the resulting `R8` buffer into the field's Kokkos host array.
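
Steps 2 through 4 can be illustrated with a small self-contained sketch. It does not use the Omega API: `DimRegistrySketch`, `DimLayoutSketch`, `classifyDims`, and `resolveSecondaryDim` are hypothetical names, and the registry is a plain map standing in for `Dimension::AllDims`. The sketch only counts mesh and secondary dimensions; it does not check dimension order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for Omega's dimension registry (not the real API).
struct DimRegistrySketch {
   std::map<std::string, int> Lengths; // every known dimension and its length
   std::set<std::string> Distributed;  // names of distributed mesh dimensions
};

// Result of classifying the dimensions of one variable.
struct DimLayoutSketch {
   bool Valid = false;       // true if the layout is supported
   std::string MeshDim;      // the single distributed mesh dimension
   std::string SecondaryDim; // empty for a 1-D field
   int SecondaryLength = 0;  // length of the secondary dimension, if any
};

// Steps 2 and 3: a dimension is a mesh dimension if it exists in the registry
// and is distributed; the layout is valid only with exactly one mesh dimension
// and at most one secondary dimension.
DimLayoutSketch classifyDims(const DimRegistrySketch &Reg,
                             const std::vector<std::string> &DimNames,
                             const std::vector<int> &DimLengths) {
   assert(DimNames.size() == DimLengths.size());
   DimLayoutSketch Layout;
   int NMesh      = 0;
   int NSecondary = 0;
   for (std::size_t I = 0; I < DimNames.size(); ++I) {
      bool IsMesh = Reg.Lengths.count(DimNames[I]) > 0 &&
                    Reg.Distributed.count(DimNames[I]) > 0;
      if (IsMesh) {
         ++NMesh;
         Layout.MeshDim = DimNames[I];
      } else {
         ++NSecondary;
         Layout.SecondaryDim    = DimNames[I];
         Layout.SecondaryLength = DimLengths[I];
      }
   }
   Layout.Valid = (NMesh == 1 && NSecondary <= 1);
   return Layout;
}

// Step 4: create the secondary dimension if absent, reuse it if the length
// matches, and report failure (false) if the length differs.
bool resolveSecondaryDim(DimRegistrySketch &Reg, const std::string &Name,
                         int Length) {
   auto It = Reg.Lengths.find(Name);
   if (It == Reg.Lengths.end()) {
      Reg.Lengths.emplace(Name, Length);
      return true;
   }
   return It->second == Length;
}
```

With only `NCells` registered as a distributed dimension, a variable dimensioned `(NCells, NMocBasins)` classifies as valid, while a variable dimensioned only by `NMocBasins` does not.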

### Dynamic decomposition cache

2-D decompositions are cached in the private static member
```c++
static std::map<std::tuple<std::string, I4>, int> DynamicDecomps;
```
keyed on `(meshDimName, nSecondary)`. All dynamic decompositions use
`IOTypeR8`. The global offset for local mesh index `j` (0-based global index
`globalJ`) at secondary index `r` is `globalJ * nSecondary + r`, which matches
the row-major layout of `HostArray2DR8`.
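
The offset formula can be sketched as a standalone function. `buildOffsets2D` and its `GlobalIDs` argument (the 0-based global mesh index of each local entry) are hypothetical; the actual decomposition setup inside `getOrCreateDynamicDecomp()` may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Builds the global offsets of a (mesh, secondary) array stored in row-major
// order: entry (globalJ, r) maps to globalJ * NSecondary + r.
// GlobalIDs is a hypothetical input holding the 0-based global mesh index of
// each local mesh entry, in local order.
std::vector<std::int64_t>
buildOffsets2D(const std::vector<std::int64_t> &GlobalIDs, int NSecondary) {
   assert(NSecondary > 0);
   std::vector<std::int64_t> Offsets;
   Offsets.reserve(GlobalIDs.size() * static_cast<std::size_t>(NSecondary));
   for (std::int64_t GlobalJ : GlobalIDs) {
      for (int R = 0; R < NSecondary; ++R) {
         Offsets.push_back(GlobalJ * NSecondary + R);
      }
   }
   return Offsets;
}
```

For example, local entries with global indices 2 and 0 and `NSecondary = 3` give the offsets 6, 7, 8, 0, 1, 2, so each local row stays contiguous.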

Cached decompositions persist for the lifetime of the process, including across
any restarts within it, and are freed when `IOStream::finalize()` is called.
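
A get-or-create cache of this shape might look like the sketch below. `DecompCacheSketch` and its `Builder` and `Destroy` callbacks are hypothetical stand-ins for the code that creates and destroys a decomposition; only the `(meshDimName, nSecondary)` key mirrors the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <tuple>

// Cache of decomposition IDs keyed on (mesh dimension name, secondary length).
class DecompCacheSketch {
 public:
   // Returns the cached ID for the key, or calls Builder once and caches it.
   int getOrCreate(const std::string &MeshDim, int NSecondary,
                   const std::function<int()> &Builder) {
      assert(NSecondary > 0);
      std::tuple<std::string, int> Key(MeshDim, NSecondary);
      auto It = Cache.find(Key);
      if (It != Cache.end())
         return It->second;
      int DecompID = Builder();
      Cache.emplace(Key, DecompID);
      return DecompID;
   }

   // Releases every cached decomposition, as a finalize step would.
   void clear(const std::function<void(int)> &Destroy) {
      for (const auto &Entry : Cache)
         Destroy(Entry.second);
      Cache.clear();
   }

   std::size_t size() const { return Cache.size(); }

 private:
   std::map<std::tuple<std::string, int>, int> Cache;
};
```

Two fields that share the same mesh dimension and secondary length then share one decomposition, and the builder runs once per distinct key.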

### `IO::getVarInfo()`

The helper function
```c++
Error IO::getVarInfo(int FileID,
                     const std::string &VarName,
                     int &NVarDims,
                     std::vector<std::string> &DimNames,
                     std::vector<I4> &DimLengths,
                     IO::IODataType &NativeType);
```
wraps `PIOc_inq_varndims`, `PIOc_inq_vartype`, `PIOc_inq_vardimid`,
`PIOc_inq_dimname`, and `PIOc_inq_dimlen` to retrieve all metadata needed to
classify and register a dynamic field. It is declared in `IO.h` and follows
the same error-return conventions as the other IO functions.

### `IOStream::readAllDynamic()`

All dynamic streams are read through a single call:
```c++
Error DynErr = IOStream::readAllDynamic(ModelClock);
CHECK_ERROR_ABORT(DynErr, "Error reading dynamic input streams");
```
The method iterates `AllStreams` and calls `readStream()` for every stream with
`DynamicFields=true`. Users can define any number of dynamic streams in
`omega.yml` and they will all be read automatically without modifying source
code.
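
The iteration pattern can be sketched without the Omega types. `StreamSketch`, `readAllDynamicSketch`, and the `ReadStream` callback are hypothetical; the real method operates on `AllStreams` and returns an `Error` rather than a count.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical, reduced view of a stream: just its name and dynamic flag.
struct StreamSketch {
   std::string Name;
   bool DynamicFields;
};

// Visits every stream, reads only those flagged as dynamic, and stops at the
// first failure. Returns the number of streams read, or -1 on failure.
int readAllDynamicSketch(
    const std::map<std::string, StreamSketch> &AllStreams,
    const std::function<bool(const StreamSketch &)> &ReadStream) {
   assert(ReadStream && "a read callback is required");
   int NRead = 0;
   for (const auto &Entry : AllStreams) {
      const StreamSketch &Stream = Entry.second;
      if (!Stream.DynamicFields)
         continue; // standard streams are read elsewhere
      if (!ReadStream(Stream))
         return -1;
      ++NRead;
   }
   return NRead;
}
```

Because the loop filters on the flag rather than on stream names, adding another dynamic stream to the configuration requires no change to the calling code.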

In `ocnInit`, this call is placed in `initOmegaModulesImpl()` between
`HorzMesh::init()` and `VertCoord::init()`. Application code that uses
`initOmegaModules()` automatically gets this behavior. Standalone applications
that perform their own initialization must call `readAllDynamic()` explicitly
after `HorzMesh::init()`.

### Initialization ordering

`registerAndReadDynamicField()` classifies mesh dimensions by checking
`Dimension::isDistributedDim()`, which returns `true` only for dimensions that
were registered during mesh initialization. Dynamic stream reads must therefore
occur after `HorzMesh::init()` and before any code that depends on the
resulting fields. The recommended sequence is:

```c++
Decomp::init();
HorzMesh::init(); // registers NCells, NEdges, NVertices
Error DynErr = IOStream::readAllDynamic(ModelClock); // reads all dynamic streams
CHECK_ERROR_ABORT(DynErr, "Error reading dynamic input streams");
// ... construct analysis operators or other consumers ...
```
3 changes: 3 additions & 0 deletions components/omega/doc/index.md
@@ -35,6 +35,7 @@ userGuide/Error
userGuide/Field
userGuide/IO
userGuide/IOStreams
userGuide/DynamicInputStreams
userGuide/Halo
userGuide/HorzMesh
userGuide/HorzOperators
@@ -81,6 +82,7 @@ devGuide/Error
devGuide/Field
devGuide/IO
devGuide/IOStreams
devGuide/DynamicInputStreams
devGuide/Halo
devGuide/HorzMesh
devGuide/HorzOperators
@@ -113,6 +115,7 @@ design/Config
design/DataTypes
design/Decomp
design/Driver
design/DynamicInputStreams
design/EOS
design/Error
design/Halo
79 changes: 79 additions & 0 deletions components/omega/doc/userGuide/DynamicInputStreams.md
@@ -0,0 +1,79 @@
(omega-user-dynamicinputstreams)=

## Dynamic Input Streams

A regular [IOStream](#omega-user-iostreams) requires every field listed in its
`Contents` to be pre-registered in Omega's field registry before the stream is
read. Dynamic input streams lift this restriction. When a stream is marked
with `DynamicFields: true`, Omega inspects the input file at read time,
discovers each field's dimensions and type, allocates storage, and registers the
field automatically.

The primary use case is weight fields used by analysis operators, such as region
masks, transect edge-sign arrays, and similar arrays whose names and secondary
dimensions are not fixed at compile time. Once registered, dynamic fields are
indistinguishable from any other Omega field and can be accessed by name through
the standard field registry.

### Configuration

Dynamic streams are placed in the same `IOStreams` section of the Omega input
configuration file as all other streams. The only additional option is
`DynamicFields: true`:

```yaml
Omega:
  IOStreams:
    MocMasksAndTransects:
      UsePointerFile: false
      Filename: /path/to/oQU240_mocBasinsAndTransects.nc
      Mode: read
      Freq: 1
      FreqUnits: OnStartup
      DynamicFields: true
      Contents:
        - MocCellMasks
        - MocEdgeSigns
```

The names in `Contents` must match the variable names in the netCDF file
exactly. These names become the Omega-internal names used to retrieve the
fields after reading.

### Constraints on dynamic fields

Each field in a dynamic stream must satisfy the following constraints.
Violations abort initialization with a descriptive error message.

- **Exactly one mesh dimension**: the field must have exactly one dimension that
is a distributed Omega mesh dimension (`NCells`, `NEdges`, or `NVertices`).
Fields that lack a mesh dimension (e.g. a 1-D region-only array) are not
supported.

- **At most one secondary dimension**: beyond the mesh dimension, at most one
additional non-mesh dimension (e.g. `NMocBasins`) is allowed.

- **Unique field names**: a dynamic field name must not already exist in
Omega's field registry, whether from another dynamic stream or from model
state variables.

- **Consistent secondary dimension sizes**: if two streams reference a
secondary dimension by the same name (e.g. `NMocBasins`), the dimension
length must be identical in both files. Use descriptive, unique dimension
names to avoid unintended conflicts between unrelated streams.

- **Storage type**: all dynamic fields are stored as 64-bit floating-point
(`R8`) regardless of the native type in the file. Integer (`I4`, `I8`) and
single-precision (`R4`) fields are promoted to `R8` when read.

- **Re-read on every initialization**: dynamic fields are not written to restart
files and must be re-read from the original input file on every model start,
including restarts.
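
The storage-type constraint amounts to a widening conversion. The sketch below uses a hypothetical helper, `promoteToR8`, to show the effect on a buffer; in Omega the conversion happens inside the I/O library during the read, not in user code.

```cpp
#include <cassert>
#include <vector>

// Converts a buffer of any arithmetic type to 64-bit floating point,
// element by element.
template <typename T>
std::vector<double> promoteToR8(const std::vector<T> &In) {
   std::vector<double> Out(In.begin(), In.end());
   assert(Out.size() == In.size());
   return Out;
}
```

Typical mask and edge-sign values (0, 1, -1) are represented exactly after promotion. 64-bit integers with magnitude above 2^53 cannot all be represented exactly in `R8`.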

### Initialization ordering

Dynamic streams are read automatically during `ocnInit` via
`IOStream::readAllDynamic()`, which is called after mesh initialization.
No explicit read call is needed in the configuration or in model code.
Adding a new `DynamicFields: true` stream to `omega.yml` is sufficient for
it to be discovered and read on the next model start.
74 changes: 74 additions & 0 deletions components/omega/src/base/IO.cpp
@@ -768,6 +768,80 @@ Error readArray(void *Array, // [out] array to be read

} // End IOReadArray

//------------------------------------------------------------------------------
// Queries a variable's dimension names, global lengths, and native data type
// from an open file. Returns an error code if the variable is not found or
// if dimension metadata cannot be read.
Error getVarInfo(
    int FileID,                         // [in] ID of open file
    const std::string &VarName,         // [in] variable name to query
    int &NVarDims,                      // [out] number of dimensions
    std::vector<std::string> &DimNames, // [out] name of each dimension
    std::vector<I4> &DimLengths,        // [out] global length of each dim
    IODataType &NativeType              // [out] native data type in file
) {

   Error Err;
   int PIOErr = 0;

   // Get variable ID
   int VarID = -1;
   PIOErr    = PIOc_inq_varid(FileID, VarName.c_str(), &VarID);
   if (PIOErr != PIO_NOERR)
      RETURN_ERROR(Err, ErrorCode::Fail,
                   "IO::getVarInfo: Variable {} not found in file", VarName);

   // Get number of dimensions
   PIOErr = PIOc_inq_varndims(FileID, VarID, &NVarDims);
   if (PIOErr != PIO_NOERR)
      RETURN_ERROR(Err, ErrorCode::Fail,
                   "IO::getVarInfo: Error getting ndims for variable {}",
                   VarName);

   // Get native data type
   nc_type VarType;
   PIOErr = PIOc_inq_vartype(FileID, VarID, &VarType);
   if (PIOErr != PIO_NOERR)
      RETURN_ERROR(Err, ErrorCode::Fail,
                   "IO::getVarInfo: Error getting type for variable {}",
                   VarName);
   NativeType = static_cast<IODataType>(VarType);

   // Get dimension IDs
   std::vector<int> DimIDs(NVarDims);
   PIOErr = PIOc_inq_vardimid(FileID, VarID, DimIDs.data());
   if (PIOErr != PIO_NOERR)
      RETURN_ERROR(Err, ErrorCode::Fail,
                   "IO::getVarInfo: Error getting dimids for variable {}",
                   VarName);

   // Get dimension names and lengths
   DimNames.resize(NVarDims);
   DimLengths.resize(NVarDims);
   for (int IDim = 0; IDim < NVarDims; ++IDim) {
      char DimName[PIO_MAX_NAME + 1] = {'\0'};
      PIOErr = PIOc_inq_dimname(FileID, DimIDs[IDim], DimName);
      if (PIOErr != PIO_NOERR)
         RETURN_ERROR(
             Err, ErrorCode::Fail,
             "IO::getVarInfo: Error getting name for dim {} of variable {}",
             IDim, VarName);
      DimNames[IDim] = DimName;

      PIO_Offset DimLen;
      PIOErr = PIOc_inq_dimlen(FileID, DimIDs[IDim], &DimLen);
      if (PIOErr != PIO_NOERR)
         RETURN_ERROR(
             Err, ErrorCode::Fail,
             "IO::getVarInfo: Error getting length for dim {} of variable {}",
             IDim, VarName);
      DimLengths[IDim] = static_cast<I4>(DimLen);
   }

   return Err;

} // End getVarInfo

//------------------------------------------------------------------------------
// Reads a non-distributed variable. Uses a void pointer for generic interface.
// All arrays are assumed to be in contiguous storage. Returns an error code so
12 changes: 12 additions & 0 deletions components/omega/src/base/IO.h
@@ -321,6 +321,18 @@ Error readArray(void *Array, ///< [out] array to be read
int Frame = -1 ///< [in] opt frame if multiple time slices
);

/// Queries a variable's dimension names, global lengths, and native data type
/// from an open file. Returns an error code if the variable is not found or
/// if any dimension metadata cannot be read.
Error getVarInfo(
    int FileID,                         ///< [in] ID of open file
    const std::string &VarName,         ///< [in] variable name to query
    int &NVarDims,                      ///< [out] number of dimensions
    std::vector<std::string> &DimNames, ///< [out] name of each dimension
    std::vector<I4> &DimLengths,        ///< [out] global length of each dim
    IODataType &NativeType              ///< [out] native data type in file
);

/// Reads a non-distributed variable. We use a void pointer here to create
/// a generic interface for all types. Arrays are assumed to be in contiguous
/// storage so the arrays of any dimension are treated as a 1-d array with