chore: added wal/snapshot doc #25856

praveen-influx · 2025-01-17T13:28:10Z

Moving some of my work notes into a doc (might be handy to understand the wal/snapshotting process)

hiltontj

Looking good, I just had a couple of comments.

hiltontj · 2025-01-20T15:14:03Z

influxdb3_wal/wal.md

+                                                                   │                ┌────────────┐
+                                                                   └───────────────►│clear buffer│ (whatever snapshotted is removed)
+                                                                                    └────────────┘
+```


A useful addition to this diagram would be an arrow showing the entry point for user writes, i.e., where writes from the user go first (the wal buffer?). Otherwise, the order of operations is not clear. If you could connect the numbers from the steps described below to locations / arrows on the diagram, that would be helpful.

praveen-influx · 2025-01-20

Good point - I'll try to link the steps to the diagram and add the incoming writes as well.

hiltontj · 2025-01-20T15:20:07Z

influxdb3_wal/wal.md

+   If going ahead with force snapshotting, pick all the wal periods in the tracker and find the max time from most recent wal period. This will be
+   used as the `end_time_marker` to evict data from query buffer. Because forcing a snapshot can be triggered when wal buffer is empty (even though
+   queryable buffer is full), we need to add `Noop` (a no-op WalOp) to the wal file to hold the snapshot details in wal file.
+


Add a header on this diagram, e.g.,

##### Forced snapshot

pauldix · 2025-01-24T17:14:21Z

influxdb3_wal/wal.md

+1. When _writes_ comes in, they go into a write batch in wal buffer. These batches are held per database and the batches keep track of min
+   and max times within each batch. These batches further hold per table chunks. This chunk is created by taking incoming rows and pinning
+   them to a period. It is done by `t - (t % gen_1_duration)`. If `gen_1_duration` is 10 mins, then all of the rows will be divided into
+   10 min chunks. As an example if there are rows for 10.29 and 10.35 then they both go into 2 separate chunks (10.20 and 10.30). And this


What are 10.29 and 10.35? Are those meant to be timestamps, i.e., the times 10:29 and 10:35?
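For reference, the pinning formula quoted in the hunk can be sketched as below. This reads the values as wall-clock times, which is an assumption since the hunk does not say, and it uses seconds as the unit for readability. `chunk_start` and `hm` are hypothetical names for illustration, not functions from the codebase.

```rust
/// Pin a timestamp to the start of its gen1 chunk: `t - (t % gen_1_duration)`.
/// Both arguments use the same unit (seconds in this sketch).
/// Assumes non-negative timestamps; pre-epoch values would need `rem_euclid`.
fn chunk_start(t: i64, gen_1_duration: i64) -> i64 {
    t - (t % gen_1_duration)
}

/// Seconds since midnight for an hour:minute pair (illustration only).
fn hm(hour: i64, minute: i64) -> i64 {
    hour * 3600 + minute * 60
}
```

With a 10 minute (600 second) `gen_1_duration`, `chunk_start(hm(10, 29), 600)` equals `hm(10, 20)` and `chunk_start(hm(10, 35), 600)` equals `hm(10, 30)`, so the two rows land in separate chunks.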

pauldix · 2025-01-24T17:16:54Z

influxdb3_wal/wal.md

+   them to a period. It is done by `t - (t % gen_1_duration)`. If `gen_1_duration` is 10 mins, then all of the rows will be divided into
+   10 min chunks. As an example if there are rows for 10.29 and 10.35 then they both go into 2 separate chunks (10.20 and 10.30). And this
+   10.20 and 10.30 are used later as the key in queryable buffer.
+2. Every flush interval, the wal buffer is flushed and all batches are written to to wal file (converts to wal content and gets min/max


You should also mention that each incoming write request has a oneshot channel created for it. That channel is called back after the flush and after the data has been placed into the queryable buffer, which then returns a success to the client.
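A minimal sketch of that notify-after-flush pattern follows. It uses `std::sync::mpsc::channel` as a stand-in for a oneshot channel so that the example stays dependency-free. `WalBuffer`, `PendingWrite`, and `WriteResult` are hypothetical types for illustration and do not mirror the real implementation.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical outcome reported back to the request handler.
#[derive(Debug, PartialEq)]
enum WriteResult {
    Success,
}

/// A buffered write plus the sending half used to notify the waiting request.
struct PendingWrite {
    rows: Vec<String>,
    notify: Sender<WriteResult>,
}

/// Hypothetical in-memory wal buffer.
#[derive(Default)]
struct WalBuffer {
    pending: Vec<PendingWrite>,
}

impl WalBuffer {
    /// Buffer the rows and return the receiver the request handler waits on.
    fn write(&mut self, rows: Vec<String>) -> Receiver<WriteResult> {
        let (tx, rx) = channel();
        self.pending.push(PendingWrite { rows, notify: tx });
        rx
    }

    /// Flush: persisting to the wal file is elided here. Rows are moved into
    /// the queryable buffer first, and only then is each request notified.
    fn flush(&mut self, queryable_buffer: &mut Vec<String>) {
        for write in self.pending.drain(..) {
            queryable_buffer.extend(write.rows);
            // The receiver may be gone (client disconnected); ignore that case.
            let _ = write.notify.send(WriteResult::Success);
        }
    }
}
```

The request handler holds the receiver and responds to the client only once `flush` has sent on the channel, so a success response implies the data is already queryable.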

pauldix · 2025-01-24T17:20:13Z

influxdb3_wal/wal.md

+
+   ```
+
+   If it is a normal snapshot, then leave one wal period (`3` in eg below) and pick the last one (`2` in eg below) max time used as `end_time_marker`


This isn't strictly true. What the WAL periods are looking for is that the data written into the oldest WAL files has timestamps that fall into chunks that are no longer receiving writes. So if we have the default 10m gen1 chunks and we're always writing data with a time of "now", then we would only snapshot after the 10m chunk time goes cold. If we have lagged collection, by say 1m, we won't snapshot until that 10m of wall clock time has passed + 1m, but we would only snapshot the wal files from before that time. So we'd likely leave behind 60 wal files, which is by design.
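The "cold chunk" idea can be illustrated with a simplified sketch. It is not the snapshot tracker's actual algorithm. It assumes that periods are ordered oldest first and that the chunk holding the newest timestamp seen is the only one still receiving writes. `WalPeriod`, `chunk_start`, and `split_for_snapshot` are hypothetical names.

```rust
/// Hypothetical summary of one wal file: the time range of the rows it holds.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WalPeriod {
    wal_file_number: u64,
    min_time: i64,
    max_time: i64,
}

/// Start of the gen1 chunk containing `t` (same unit as `gen_1_duration`).
fn chunk_start(t: i64, gen_1_duration: i64) -> i64 {
    t - (t % gen_1_duration)
}

/// Split periods (oldest first) into (snapshot, leave behind).
/// A period is snapshotted only if all of its rows fall in chunks older than
/// the chunk that holds the newest timestamp, i.e. chunks that have gone cold.
fn split_for_snapshot(
    periods: &[WalPeriod],
    gen_1_duration: i64,
) -> (Vec<WalPeriod>, Vec<WalPeriod>) {
    let newest = match periods.iter().map(|p| p.max_time).max() {
        Some(t) => t,
        None => return (Vec::new(), Vec::new()),
    };
    let hot_chunk = chunk_start(newest, gen_1_duration);
    // Take the leading run of periods whose data sits entirely in cold chunks.
    let split = periods
        .iter()
        .position(|p| p.max_time >= hot_chunk)
        .unwrap_or(periods.len());
    (periods[..split].to_vec(), periods[split..].to_vec())
}
```

Under these assumptions, wal files that still contain rows for the hot chunk are left behind, which matches the point that retaining some recent wal files is by design.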

pauldix · 2025-01-24T17:23:29Z

influxdb3_wal/wal.md

+         ```
+
+            │
+            ├───10.20───────────►┌────────────────────────────┐


Same question as above: is 10.20 meant to be a timestamp like 10:20? If so, it might make more sense to use 2025-01-24T10:20 so that it's clearer.

pauldix · 2025-01-24T17:24:11Z

influxdb3_wal/wal.md

+
+            │
+            ├───10.20───────────►┌────────────────────────────┐
+            │                    │  chunk 10.20 - 10.29       │


If these are times, it's better expressed as [10:20 - 10:30), which indicates that the 30 is exclusive.

chore: added wal/snapshot doc

b5f00f1

praveen-influx force-pushed the praveen/wal-docs branch from 8cc5d20 to b5f00f1 Compare January 17, 2025 13:29

praveen-influx requested a review from a team January 17, 2025 15:45

hiltontj reviewed Jan 20, 2025


pauldix reviewed Jan 24, 2025

