Merged
2 changes: 1 addition & 1 deletion .github/workflows/metadata-catalog.yml
@@ -17,7 +17,7 @@ concurrency:

env:
TABLE_NAME: layer_definitions
- S3_BASE_PATH: s3://riverscapes-athena/riverscapes_metadata/layer_definitions
+ S3_BASE_PATH: s3://riverscapes-athena/riverscapes_metadata/layer_definitions_raw/0.8/
ATHENA_DATABASE: default
ATHENA_RESULT_BUCKET: s3://riverscapes-athena-output/query-results/metadata # <-- ensure this exists
AWS_REGION: us-west-2
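
As a rough illustration of how a pipeline script might consume the updated `S3_BASE_PATH`, the sketch below splits the URI into bucket and key prefix. The helper names `split_s3_uri` and `schema_version_from_prefix` are hypothetical and do not come from this repository. Reading the trailing `0.8` segment as the schema version is an assumption based on the changelog entry in this PR.

```python
import os
from urllib.parse import urlparse

# Value taken from the workflow's env block; used here only as a fallback.
DEFAULT_S3_BASE_PATH = (
    "s3://riverscapes-athena/riverscapes_metadata/layer_definitions_raw/0.8/"
)


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// URI into (bucket, key_prefix). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # urlparse keeps the leading slash on the path; S3 keys do not start with one.
    return parsed.netloc, parsed.path.lstrip("/")


def schema_version_from_prefix(key_prefix: str) -> str:
    """Return the last path segment, ASSUMED here to be the schema version."""
    return key_prefix.rstrip("/").rsplit("/", 1)[-1]


if __name__ == "__main__":
    # Prefer the environment variable set by the workflow, if present.
    uri = os.environ.get("S3_BASE_PATH", DEFAULT_S3_BASE_PATH)
    bucket, key_prefix = split_s3_uri(uri)
    print(bucket, key_prefix, schema_version_from_prefix(key_prefix))
```

Keeping the version as the last path segment would let a new schema version land in a sibling folder without overwriting the previous one, if that is the intent of the path change.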
8 changes: 8 additions & 0 deletions DataExchangeScripts.code-workspace
@@ -42,5 +42,13 @@
"**/.venv/**": true,
"**/__pycache__/**": true
},
"json.schemas": [
{
"fileMatch": [
"**/layer_definitions.json"
],
"url": "https://xml.riverscapes.net/riverscapes_metadata/schema/layer_definitions.schema.json"
}
]
}
}
25 changes: 25 additions & 0 deletions pipelines/README-AthenaOrganization.md
@@ -0,0 +1,25 @@
# Riverscapes-Athena AWS Bucket Organization

> [!NOTE]
> 2026-01-15: This file lives in the `data-exchange-scripts` repository at `pipelines/README-AthenaOrganization.md`.

Pipeline output for Athena should go in the `riverscapes-athena` bucket, under the `data_exchange/` prefix.

Suggested naming convention, partitions and organization:

* one folder per project type (named with its machine name), assuming all data in that folder comes from the same project type
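
A minimal sketch of what this convention could look like in code, assuming Hive-style `key=value` partition folders (the README does not specify a partition scheme). The helper `athena_prefix` and the `huc2` partition key are hypothetical names used only for illustration.

```python
BUCKET = "riverscapes-athena"  # bucket name as written in the workflow's S3_BASE_PATH
BASE_PREFIX = "data_exchange"  # prefix named in this README


def athena_prefix(project_type: str, **partitions: str) -> str:
    """Build the S3 prefix for one project type (hypothetical helper).

    Optional keyword arguments become Hive-style partition folders,
    which is an ASSUMED layout, not one confirmed by this README.
    """
    machine_name = project_type.strip().strip("/")
    if not machine_name:
        raise ValueError("project_type must be a non-empty machine name")
    parts = [BUCKET, BASE_PREFIX, machine_name]
    # Keyword arguments keep their call order, so partition folders nest in that order.
    parts += [f"{key}={value}" for key, value in partitions.items()]
    return "s3://" + "/".join(parts) + "/"
```

For example, `athena_prefix("rs_metric_engine2")` yields a prefix directly under `data_exchange/`, and passing a keyword such as `huc2="17"` appends a `huc2=17/` folder beneath it.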

## Contents as of 2026-01-15

The bucket is currently a bit of a mess:

* `2025conus-projects` - a "materialized view", so to speak, of projects, generated daily by a Glue script from other data in the `data_exchange` folder
* `project_types` - a single handmade file used for tracking the waterfall of models in the CONUS2025 runs
* `projects` - populated by a DynamoDB sync, I think
* `riverscape_metrics` - Oct 12 2025 scrape using `rme_to_athena\rme_to_athena_parquet.py`, supplemented with more recent incremental updates
* `rs_metric_engine2` - Jan 15 enhancement of the `riverscape_metrics` Parquet files with simplified geometry, using `add_simplified_geom_pq.py`
* `rs-context`
* `rsdynamics-metrics/`
* `rsdynamics/` - maybe we should have grouped them
* `table_column_defs` - old version, deprecated and to be removed soon
* `table_column_defs_v2` - ditto
* `test-double-geom` - test of whether one Parquet file can hold two geometries (answer: yes); to be removed
11 changes: 11 additions & 0 deletions pipelines/rme_to_athena/CHANGELOG.md
@@ -0,0 +1,11 @@
# RME to Athena Pipeline Changelog

## 1.1

* Added: Parquet files generated from a GeoPackage now include a `geometry_simplified` column
* Changed: Metadata `layer_definitions.json` updated to the 0.8 schema

## 1.0

* Used for the CONUS run scrape
* The Parquet results were later augmented with a simplified geometry column using `add_simplified_geom_pq.py` (without going back to the source data in the Data Exchange)
2 changes: 1 addition & 1 deletion pipelines/rme_to_athena/__version__.py
@@ -1 +1 @@
- __version__ = "1.0"
+ __version__ = "1.1"