I have a dataset in Postgres of boat locations on waterways. Here is a sample of the table:
| boat_id | ts | waterway_id |
|---|---|---|
| Boat_A | 2019-01-01 16:29:11 | WW_01 |
| Boat_A | 2019-01-01 17:03:04 | WW_02 |
| Boat_B | 2019-01-01 16:11:34 | WW_01 |
| Boat_B | 2019-01-01 16:13:45 | WW_01 |
| Boat_B | 2019-01-01 17:05:13 | WW_01 |
| Boat_C | 2019-01-01 16:03:00 | WW_01 |
| Boat_C | 2019-01-01 16:09:50 | WW_02 |
| Boat_C | 2019-01-01 16:16:22 | WW_01 |
| Boat_C | 2019-01-01 16:45:44 | WW_01 |
boat_id is the unique identifier of the boat, ts is the timestamp of the observation, and waterway_id is the unique identifier of the waterway. For each hour in the dataset, I would like to know how many boats passed each waterway. The result should look like this:
| waterway_id | report_ts | passage_count |
|---|---|---|
| WW_01 | 2019-01-01 00:00 | 3 |
| WW_01 | 2019-01-01 01:00 | 1 |
| ... | ... | ... |
| WW_01 | 2019-12-31 23:00 | 5 |
| WW_02 | 2019-01-01 00:00 | 13 |
| WW_02 | 2019-01-01 01:00 | 11 |
| ... | ... | ... |
The raw data contains the position of boats, not passages. Thus:
- Multiple datapoints of the same boat on the same waterway should be counted as a single passage.
- If a boat has been on another waterway and comes back it should be counted as another passage.
- If a boat is detected on the same waterway in multiple hours without being on another waterway in between, it should be counted as a single passage in the hour it was first detected.

In the example data above, Boat_A makes 1 passage on WW_01 at 16h and 1 on WW_02 at 17h; Boat_B makes 1 passage on WW_01 at 16h (there is no passage at 17h because it did not go to another waterway in between); Boat_C makes 2 passages on WW_01 at 16h and 1 passage on WW_02 at 16h. As a table (waterway-hour combinations with 0 passages do not have to be included in the result):
| waterway_id | report_ts | passage_count |
|---|---|---|
| WW_01 | 2019-01-01 16:00 | 4 |
| WW_02 | 2019-01-01 16:00 | 1 |
| WW_02 | 2019-01-01 17:00 | 1 |
What should the query to get this result look like? In my mind, it consists of two steps:
- Computing unique passages per boat per waterway
- Organizing these in a table as the example above
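
A minimal sketch of these two steps, not a definitive solution. The table name `boat_positions` is an assumption (the question does not name the table), and the inline `VALUES` list only reproduces the sample rows so the query can run on its own. It also assumes a boat never has two rows with the same `ts`, since the ordering inside `lag` would otherwise be ambiguous. Step one uses `lag` to look at the previous waterway of the same boat and keeps only the rows where the waterway changes; step two truncates the timestamp of those rows to the hour and counts them.

```sql
-- Hypothetical table name; these rows mirror the sample data above.
WITH boat_positions (boat_id, ts, waterway_id) AS (
    VALUES
        ('Boat_A', timestamp '2019-01-01 16:29:11', 'WW_01'),
        ('Boat_A', timestamp '2019-01-01 17:03:04', 'WW_02'),
        ('Boat_B', timestamp '2019-01-01 16:11:34', 'WW_01'),
        ('Boat_B', timestamp '2019-01-01 16:13:45', 'WW_01'),
        ('Boat_B', timestamp '2019-01-01 17:05:13', 'WW_01'),
        ('Boat_C', timestamp '2019-01-01 16:03:00', 'WW_01'),
        ('Boat_C', timestamp '2019-01-01 16:09:50', 'WW_02'),
        ('Boat_C', timestamp '2019-01-01 16:16:22', 'WW_01'),
        ('Boat_C', timestamp '2019-01-01 16:45:44', 'WW_01')
),
flagged AS (
    SELECT boat_id,
           ts,
           waterway_id,
           -- Waterway of this boat's previous row; NULL for its first row.
           lag(waterway_id) OVER (PARTITION BY boat_id ORDER BY ts) AS prev_waterway_id
    FROM boat_positions
),
passages AS (
    -- Step 1: a row starts a new passage when the waterway differs from the
    -- previous row of the same boat. IS DISTINCT FROM treats the NULL of a
    -- boat's first row as "different", so that row counts as a passage too.
    SELECT boat_id, ts, waterway_id
    FROM flagged
    WHERE prev_waterway_id IS DISTINCT FROM waterway_id
)
-- Step 2: bucket each passage into the hour it was first detected and count.
SELECT waterway_id,
       date_trunc('hour', ts) AS report_ts,
       count(*) AS passage_count
FROM passages
GROUP BY waterway_id, date_trunc('hour', ts)
ORDER BY waterway_id, report_ts;
```

On the sample rows this returns 4 passages for WW_01 at 16:00, 1 for WW_02 at 16:00 and 1 for WW_02 at 17:00, matching the expected table above. Against the real table, the `boat_positions` CTE would be dropped and the actual table name used in `flagged`.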
Fiddle here