Update the documentation to use parquet output #2607
base: main
@@ -13,12 +13,12 @@ read more, we have a [concepts overview](./explanation_concepts.md) discussing t
 
 ## Imports
 
-Parcels depends on `xarray`, expecting inputs in the form of [`xarray.Dataset`](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html)
-and writing output files that can be read with xarray.
+Parcels depends on `xarray`, expecting inputs in the form of [`xarray.Dataset`](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html). Output files can be read with `pandas`.
Contributor
Suggested change
 
 ```{code-cell}
 import numpy as np
 import xarray as xr
+import polars as pl
 import parcels
 import parcels.tutorial
 ```
@@ -123,11 +123,11 @@ Before starting the simulation, we must define where and how frequent we want to
 We can define this in a {py:obj}`parcels.ParticleFile` object:
 
 ```{code-cell}
-output_file = parcels.ParticleFile("output-quickstart.zarr", outputdt=np.timedelta64(1, "h"))
+output_file = parcels.ParticleFile("output-quickstart.parquet", outputdt=np.timedelta64(1, "h"))
 ```
 
-The output files are in `.zarr` [format](https://zarr.readthedocs.io/en/stable/), which can be read by `xarray`.
-See the [Parcels output tutorial](./tutorial_output.ipynb) for more information on the zarr format. We want to choose
+The output files are in `.parquet` [format](https://parquet.apache.org/), which can be read by `polars`.
Contributor
Would this be a good place to link to Polars?
Suggested change
I don't think we link to it yet in the docs here
+See the [Parcels output tutorial](./tutorial_output.ipynb) for more information on the parquet format. We want to choose
 the `outputdt` argument so that it captures the smallest timescales of our interest.
 
 ## Run Simulation: `ParticleSet.execute()`
@@ -155,23 +155,22 @@ pset.execute(
 To start analyzing the trajectories computed by **Parcels**, we can open the `ParticleFile` using `xarray`:
Contributor
This needs to be updated from "xarray"
 
 ```{code-cell}
-ds_particles = xr.open_zarr("output-quickstart.zarr")
-ds_particles
+df = parcels.read_particlefile("output-quickstart.parquet")
+df
 ```
 
-The 10 particle trajectories are stored along the `trajectory` dimension, and each trajectory contains 25 observations
-(initial values + 24 hourly timesteps) along the `obs` dimension. The [working with Parcels output tutorial](./tutorial_output.ipynb)
-provides more detail about the dataset and how to analyse it.
+The file contains 250 rows: 25 observations for the 10 particle trajectories.
+The [working with Parcels output tutorial](./tutorial_output.ipynb) provides more detail about the dataset and how to analyse it.
 
 Let's verify that Parcels has computed the advection of the virtual particles!
 
 ```{code-cell}
 import matplotlib.pyplot as plt
 
-# plot positions and color particles by number of observation
-scatter = plt.scatter(ds_particles.lon.T, ds_particles.lat.T, c=np.repeat(ds_particles.obs.values,npart))
-plt.scatter(ds_particles.lon[:,0],ds_particles.lat[:,0],facecolors="none",edgecolors='r') # starting positions
-plt.scatter(lon,lat,facecolors="none",edgecolors='r') # starting positions
+# plot positions and color particles by time
+scatter = plt.scatter(df['lon'], df['lat'], c=df['time'])
+plt.scatter(df['lon'][:npart], df['lat'][:npart], facecolors="none", edgecolors='r') # starting positions
+plt.scatter(lon, lat, facecolors="none", edgecolors='r') # starting positions
 plt.xlim(31,33)
 plt.ylabel("Latitude [deg N]")
 plt.ylim(-33,-30)
@@ -196,7 +195,7 @@ location!
 ```{code-cell}
 :tags: [hide-output]
 # set up output file
-output_file = parcels.ParticleFile("output-backwards.zarr", outputdt=np.timedelta64(1, "h"))
+output_file = parcels.ParticleFile("output-backwards.parquet", outputdt=np.timedelta64(1, "h"))
 
 # execute simulation in backwards time
 pset.execute(
@@ -210,10 +209,11 @@ pset.execute(
 When we check the output, we can see that the particles have returned to their original position!
 
 ```{code-cell}
-ds_particles_back = xr.open_zarr("output-backwards.zarr")
+df_back = parcels.read_particlefile("output-backwards.parquet")
 
-scatter = plt.scatter(ds_particles_back.lon.T, ds_particles_back.lat.T, c=np.repeat(ds_particles_back.obs.values,npart))
-plt.scatter(ds_particles_back.lon[:,0],ds_particles_back.lat[:,0],facecolors="none",edgecolors='r') # starting positions
+scatter = plt.scatter(df_back['lon'], df_back['lat'], c=df_back['time'])
+particles_at_start = df_back.filter(pl.col("time") == df_back["time"].min())
+plt.scatter(particles_at_start['lon'], particles_at_start['lat'], facecolors="none", edgecolors='r') # starting positions
 plt.xlabel("Longitude [deg E]")
 plt.xlim(31,33)
 plt.ylabel("Latitude [deg N]")
@@ -226,6 +226,6 @@ Using Euler forward advection, the final positions are equal to the original pos
 
 ```{code-cell}
 # testing that final location == original location
-np.testing.assert_almost_equal(ds_particles_back['lat'].values[:,-1],ds_particles['lat'].values[:,0], 2)
-np.testing.assert_almost_equal(ds_particles_back['lon'].values[:,-1],ds_particles['lon'].values[:,0], 2)
+np.testing.assert_almost_equal(particles_at_start["lat"], lat, 2)
+np.testing.assert_almost_equal(particles_at_start['lon'], lon, 2)
 ```
I've gone through this - really nice update! I think it's quite clear