Disaster Aggregates and Hazard Scores - DaedalMap Research Projects

Project snapshot

DaedalMap handles disaster data as one shared geographic system with hazard-specific methods layered on top. The chain is explicit: normalized event records, event-to-location impact relationships, county-scale aggregates, and then hazard-specific score layers built from those historical summaries. That structure keeps the packs queryable across hazards without pretending every hazard has the same source geometry or the same scoring logic.

Data and methods

Core layers in the workflow:

Events. Canonical hazard records such as earthquakes, tornadoes, floods, tsunamis, volcanoes, and hurricane storms or tracks.
Links. Shared event-to-event relationship records for source-supported causal chains such as aftershocks or an earthquake triggering a tsunami. These are not used as a catch-all for every multi-hazard comparison.
Event areas. The event_areas relationship tables that connect each event to the shared target geography it affected, usually counties or equivalent admin2 areas.
Aggregates. Published parquet lanes such as yearly.parquet, rolling_10y.parquet, and rolling_20y.parquet inside each hazard's aggregate target tree, which today is usually aggregates/admin2.
Hazard score methods. Separate interpretation layers that use the historical aggregate baselines plus hazard-specific thresholds or external frameworks where those exist.
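
The first three layers can be sketched in a few lines of Python. This is a minimal illustration of how event records and event_areas rows could combine into yearly county counts, not the project's builder. The class names, field names (event_id, loc_id, hazard, year), and the rule that a repeated event-county pair counts once are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical minimal event record; real schemas carry more fields.
    event_id: str
    hazard: str
    year: int


@dataclass(frozen=True)
class EventArea:
    # Hypothetical event_areas row: one event touching one target area.
    event_id: str
    loc_id: str


def yearly_counts(events, event_areas):
    """Count distinct events per (loc_id, hazard, year)."""
    by_id = {e.event_id: e for e in events}
    seen = set()
    counts = Counter()
    for row in event_areas:
        event = by_id.get(row.event_id)
        if event is None:
            continue  # relationship row without a matching event record
        pair = (row.loc_id, row.event_id)
        if pair in seen:
            continue  # assumption: a duplicate event-area row counts once
        seen.add(pair)
        counts[(row.loc_id, event.hazard, event.year)] += 1
    return dict(counts)
```

A yearly lane in this sketch is simply the resulting mapping written out per hazard. Rolling lanes and scores would be derived from it rather than from the raw events.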

Each hazard lands on one explicit target geography through event_areas. Today that is usually admin2. Aggregates are built on that declared base first and only rolled upward after that. State and country summaries therefore resolve from the same underlying base instead of being rebuilt with a different method at each zoom level.
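
One way to picture the roll-up is a function that always starts from the admin2 base rows and maps each county to a parent area. The sketch below is illustrative only. The row shape (loc_id, year, event_id), the loc_id strings, and the parent lookup tables are hypothetical, and counting each event once per parent area is an assumption about how multi-county events should be handled, not a documented rule.

```python
from collections import defaultdict


def rollup(base_rows, parent_of):
    """Roll admin2 base rows up to a parent geography.

    base_rows: iterable of (loc_id, year, event_id) at the admin2 base.
    parent_of: dict mapping each admin2 loc_id to its parent area id.
    Returns {(parent_id, year): distinct_event_count}.
    """
    events = defaultdict(set)
    for loc_id, year, event_id in base_rows:
        # Assumption: an event touching several counties in the same
        # parent area is counted once at the parent level.
        events[(parent_of[loc_id], year)].add(event_id)
    return {key: len(ids) for key, ids in events.items()}


# Hypothetical lookups; both levels read the same base rows.
county_to_state = {
    "USA-CA-037": "USA-CA",
    "USA-CA-059": "USA-CA",
    "USA-NV-003": "USA-NV",
}
county_to_country = {loc: "USA" for loc in county_to_state}
```

Calling rollup with county_to_state and again with county_to_country gives state and country summaries that come from the same base rows, so the two levels cannot drift apart through separate methods.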

Relationships between disasters follow a narrower rule. DaedalMap keeps formal links for source-supported event relationships, not for every pair of hazards that overlap in time or place. A triggered tsunami, an aftershock sequence, or a source-documented volcano-earthquake relationship belongs in the relationship layer. Broad questions such as which counties experience both earthquakes and wildfires belong in the aggregate layer instead.
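
A co-occurrence question of that kind needs no link records at all. As a rough sketch, assuming aggregates have already been loaded into a plain mapping of hazard to county event counts for some window (a hypothetical in-memory shape, not the published parquet schema), it reduces to a set intersection:

```python
def counties_with_all(aggregates, hazards, min_events=1):
    """Return loc_ids that reach min_events for every listed hazard.

    aggregates: {hazard: {loc_id: event_count}} for one time window.
    This shape is an assumption made for the example.
    """
    qualifying = [
        {loc for loc, n in aggregates.get(h, {}).items() if n >= min_events}
        for h in hazards
    ]
    if not qualifying:
        return set()
    return set.intersection(*qualifying)


# Hypothetical window totals for two hazards.
window = {
    "earthquakes": {"USA-CA-037": 12, "USA-CA-059": 3, "USA-NV-003": 1},
    "wildfires": {"USA-CA-037": 5, "USA-OR-029": 2},
}
both = counties_with_all(window, ["earthquakes", "wildfires"])
```

Nothing in this query asserts that an earthquake caused a wildfire. It only reports that both appear in the same county's history, which is why it stays out of the relationship layer.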

The public assumptions behind that relationship layer are documented separately in Disaster Linking and Causal Chains.

Methodology highlights

Public discovery uses pack-facing aggregate source wrappers, while the physical parquet files live in explicit hazard-local lanes such as global/disasters/tornadoes/aggregates/admin2/rolling_10y.parquet. That split keeps pack metadata stable and keeps the published yearly and rolling outputs easy to inspect, rebuild, and QA directly.
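
The lane layout in that example path is regular enough to express as a small helper. The function below is a sketch inferred from the single path quoted above. The function name and parameter names are hypothetical, and the actual builder may compose paths differently.

```python
from pathlib import PurePosixPath


def aggregate_lane_path(hazard, lane, target="admin2", scope="global"):
    """Build a hazard-local lane path shaped like the documented example.

    hazard: hazard folder name, e.g. "tornadoes".
    lane:   lane name without extension, e.g. "yearly" or "rolling_10y".
    target: aggregate target geography (assumed default: "admin2").
    scope:  top-level folder (assumed default: "global").
    """
    return PurePosixPath(
        scope, "disasters", hazard, "aggregates", target, f"{lane}.parquet"
    )


path = aggregate_lane_path("tornadoes", "rolling_10y")
```

Here str(path) reproduces the example path from the text. Keeping the target as an explicit argument leaves room for a lane tree that is not admin2.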

The standard target is still the admin spine at admin2, but the builder contract is intentionally wider than that. When a hazard relationship table is built on another maintained geometry family, the same aggregate pipeline can target that family explicitly instead of pretending the data was county-based.

The standard aggregate family is yearly plus rolling windows. Yearly lanes answer exact-year questions and support animation. Rolling windows answer baseline questions such as what a county experienced over the last decade. Years with no recorded events inside a valid dataset span count as zero-event years, so a quiet year does not remove the county from the rolling output.
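
The zero-fill rule is easy to get wrong when yearly data is sparse, so a small sketch may help. It assumes a county's yearly counts arrive as a year-to-count mapping and that a window is emitted only once a full window of years exists inside the dataset span. Both are assumptions for illustration, not the documented builder behavior.

```python
def rolling_counts(yearly, span_start, span_end, window):
    """Trailing-window event totals for one county.

    yearly: {year: event_count}; years absent from the mapping but inside
            [span_start, span_end] are treated as zero-event years.
    Returns {window_end_year: total over the trailing `window` years}.
    """
    if window < 1:
        raise ValueError("window must be at least 1 year")
    out = {}
    # Assumption: only full windows are emitted, so the first window
    # ends at span_start + window - 1.
    for end_year in range(span_start + window - 1, span_end + 1):
        years = range(end_year - window + 1, end_year + 1)
        out[end_year] = sum(yearly.get(y, 0) for y in years)
    return out


# County with events in only two years of a 2000-2005 span.
sparse = rolling_counts({2001: 2, 2005: 1}, 2000, 2005, window=3)
```

In this example the window ending in 2004 covers three quiet years. The county still appears in the output with a total of zero instead of vanishing from the lane.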

Historical aggregates and hazard scores stay separate. The aggregate layer answers where events occurred, how often they occurred, and over what period. Hazard scores interpret that history with hazard-specific thresholds or external frameworks such as return periods, exceedance probabilities, or engineering benchmarks. Scores stay per hazard. A flood score and an earthquake score do not claim the same underlying meaning.
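
To make the separation concrete, here is a sketch of one possible interpretation step layered on an aggregate count. It uses a textbook simplification, a constant-rate Poisson model, to turn an observed count into a return period and an exceedance probability. That model is an assumption chosen for the example. It is not a description of how any DaedalMap hazard score is computed.

```python
import math


def return_period(event_count, years_observed):
    """Average years between events, from observed history alone."""
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    if event_count == 0:
        return math.inf  # no events observed in the window
    return years_observed / event_count


def exceedance_probability(event_count, years_observed, horizon_years):
    """Chance of at least one event within horizon_years.

    Assumes events follow a Poisson process with a constant annual
    rate estimated from the aggregate window (illustrative only).
    """
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    rate = event_count / years_observed
    return 1.0 - math.exp(-rate * horizon_years)
```

The inputs are plain aggregate values (a count and a window length). The modeling assumption lives entirely in the scoring function, which is the point of keeping the two layers apart.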

The relationship layer deliberately uses two storage patterns. The shared cross-hazard table is the common place for formal parent-child links. Some hazards also carry hazard-native relationship fields inside their own event rows when the source already expresses that structure. The long-term target is one clear public contract for both, while keeping the modeling honest about where a relationship came from.
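
A single contract over both patterns could look roughly like the sketch below. It reads links from a shared table and from parent fields on event rows, then tags each link with its origin. Every field name here (parent_event_id, child_event_id, relation, origin) is hypothetical, as is the choice to let the shared table win when both sources describe the same pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    parent_event_id: str
    child_event_id: str
    relation: str
    origin: str  # "shared_table" or "hazard_native" (hypothetical labels)


def unified_links(shared_rows, event_rows):
    """Merge both storage patterns into one list with provenance."""
    links = [
        Link(r["parent_event_id"], r["child_event_id"], r["relation"], "shared_table")
        for r in shared_rows
    ]
    known = {(l.parent_event_id, l.child_event_id) for l in links}
    for row in event_rows:
        parent = row.get("parent_event_id")
        if not parent:
            continue  # event row carries no hazard-native relationship
        pair = (parent, row["event_id"])
        if pair in known:
            continue  # assumption: the shared table takes precedence
        known.add(pair)
        links.append(
            Link(parent, row["event_id"], row.get("relation", "unspecified"), "hazard_native")
        )
    return links
```

Carrying the origin on every link is what keeps the merged view honest about where a relationship came from, even after the two sources are presented through one interface.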

Current findings

The aggregate system already supports one stable cross-hazard question: what has this county experienced, and over what time window? That is enough to support yearly history, rolling baselines, and county-to-state rollups from one common geography model.

The scoring layer is narrower and more hazard-dependent. Some hazards can be interpreted directly from observed event history. Others need stronger external frameworks before a public score says more than "historical exposure baseline." The public method therefore separates observed history from modeled interpretation instead of blending them into one opaque number.

The disaster relationship contract is still evolving, but the method direction is now explicit. The preferred long-term shared schema is event-id based. The current compatibility layer still includes some loc-id-based link records while the data and QA system are being cleaned up. That transition is documented publicly so outside users can see where the system is stable today and where methodology feedback is especially useful.