-
Notifications
You must be signed in to change notification settings - Fork 95
Expand file tree
/
Copy pathtutorial.py
More file actions
304 lines (238 loc) · 11 KB
/
tutorial.py
File metadata and controls
304 lines (238 loc) · 11 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
#!/usr/bin/env python
# coding: utf-8
# # Tutorial: Advection Equation with data driven DeepONet
#
# [](https://colab.research.google.com/github/mathLab/PINA/blob/master/tutorials/tutorial24/tutorial.ipynb)
#
#
# > ##### ⚠️ ***Before starting:***
# > We assume you are already familiar with the concepts covered in the [Getting started with PINA](https://mathlab.github.io/PINA/_tutorial.html#getting-started-with-pina) tutorials. If not, we strongly recommend reviewing them before exploring this advanced topic.
#
# In this tutorial, we demonstrate how to solve the advection operator learning problem using `DeepONet`. We follow the original formulation of Lu *et al.* in [*DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operator*](https://arxiv.org/abs/1910.03193).
#
# We begin by importing the necessary modules.
# In[1]:
## routine needed to run the notebook on Google Colab
try:
import google.colab
IN_COLAB = True
except:
IN_COLAB = False
if IN_COLAB:
get_ipython().system('pip install "pina-mathlab[tutorial]"')
# get the data
get_ipython().system('mkdir "data"')
get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_input_testing.pt" -O "data/advection_input_testing.pt"')
get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_input_training.pt" -O "data/advection_input_training.pt"')
get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_output_testing.pt" -O "data/advection_output_testing.pt"')
get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_output_training.pt" -O "data/advection_output_training.pt"')
import matplotlib.pyplot as plt
import torch
import warnings
from pina import Trainer, LabelTensor
from pina.model import FeedForward, DeepONet
from pina.solver import SupervisedSolver
from pina.problem.zoo import SupervisedProblem
from pina.loss import LpLoss
warnings.filterwarnings("ignore")
# ## Advection problem and data preparation
#
# We consider the 1D advection equation
# $$
# \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} = 0,
# \quad x \in [0,2], \; t \in [0,1],
# $$
# with periodic boundary conditions. The initial condition is chosen as a Gaussian pulse centered at a random location
# $\mu \sim U(0.05, 1)$ and with variance $\sigma^2 = 0.02$:
# $$
# u_0(x) = \frac{1}{\sqrt{\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}},
# \quad x \in [0,2].
# $$
#
# Our goal is to learn the operator
# $$
# \mathcal{G}: u_0(x) \mapsto u(x, t = \delta) = u_0(x - \delta),
# $$
# with $\delta = 0.5$ for this tutorial. In practice, this means learning a mapping from the initial condition to the solution at a fixed later time.
# The dataset therefore consists of trajectories where inputs are initial profiles and outputs are the same profiles shifted by $\delta$.
#
# The data has shape `[T, Nx, D]`, where:
# - `T` — number of trajectories (100 for training, 1000 for testing),
# - `Nx` — number of spatial grid points (fixed at 100),
# - `D = 1` — single scalar field value `u`.
#
# We now load the dataset and visualize sample trajectories.
# In[2]:
# loading training data
data_0_training = LabelTensor(
torch.load("data/advection_input_training.pt", weights_only=False),
labels="u0",
)
data_dt_training = LabelTensor(
torch.load("data/advection_output_training.pt", weights_only=False),
labels="u",
)
# loading testing data
data_0_testing = LabelTensor(
torch.load("data/advection_input_testing.pt", weights_only=False),
labels="u0",
)
data_dt_testing = LabelTensor(
torch.load("data/advection_output_testing.pt", weights_only=False),
labels="u",
)
# The data are loaded, let's visualize a few of the initial conditions!
# In[3]:
# storing the discretization in space:
Nx = data_0_training.shape[1]
for idx, i in enumerate(torch.randint(0, data_0_training.shape[0] - 1, (3,))):
u0 = data_0_training[int(i)].extract("u0")
u = data_dt_training[int(i)].extract("u")
x = torch.linspace(
0, 2, Nx
) # the discretization in the spatial dimension is fixed
plt.subplot(3, 1, idx + 1)
plt.plot(x, u0.flatten(), label=rf"$u_0(x)$")
plt.plot(x, u.flatten(), label=rf"$u(x, t=\delta)$")
plt.xlabel(rf"$x$")
plt.tight_layout()
plt.legend()
# Great — we have generated a traveling wave and visualized a few samples. Next, we will use this data to train a `DeepONet`.
#
# ## DeepONet
#
# The standard `DeepONet` architecture consists of two subnetworks: a **branch** network and a **trunk** network (see figure below).
#
# <center>
# <img src="http://raw.githubusercontent.com/mathLab/PINA/master/tutorials/static/deeponet.png" alt="image from: Moya, C.; Lin, G. Fed-DeepONet: Stochastic Gradient-Based Federated Training of Deep Operator Networks. Algorithms 2022, 15, 325. https://doi.org/10.3390/a15090325" style="width:50%; height:50%;"/>
# </center>
# <center>
# <i>Image source: Moya & Lin (2022)</i>
# </center>
#
# In our setting:
# - The **branch network** receives the initial condition of each trajectory, with input shape `[B, Nx]` — where `B` is the batch size and `Nx` the spatial discretization points of the field at \( t = 0 \).
# - The **trunk network** takes input of shape `[B, 1]`, corresponding to the location at which we evaluate the solution (in this 1D case, the spatial coordinate).
#
# Together, these networks learn the mapping from the initial field to the solution at a later time.
#
# We now define and train the model for the advection problem.
# In[4]:
problem = SupervisedProblem(
input_=data_0_training,
output_=data_dt_training,
input_variables=data_0_training.labels,
output_variables=data_dt_training.labels,
)
# We now proceede to create the trunk and branch networks.
# In[5]:
# create Trunk model
class TrunkNet(torch.nn.Module):
def __init__(self, **kwargs):
super().__init__()
self.trunk = FeedForward(**kwargs)
def forward(self, x):
t = (
torch.zeros(size=(x.shape[0], 1), requires_grad=False) + 0.5
) # create an input of only 0.5
return self.trunk(t)
# create Branch model
class BranchNet(torch.nn.Module):
def __init__(self, **kwargs):
super().__init__()
self.branch = FeedForward(**kwargs)
def forward(self, x):
return self.branch(x.flatten(1))
# The `TrunkNet` is implemented as a standard `FeedForward` network with a slightly modified `forward` method. In this case, the trunk network simply outputs a tensor filled with the value \(0.5\), repeated for each trajectory — corresponding to evaluating the solution at time \(t = 0.5\).
#
# The `BranchNet` is also a `FeedForward` network, but its `forward` pass first flattens the input along the last dimension. This produces a vector of length `Nx`, representing the sampled initial condition at the sensor locations.
#
# With both subnetworks defined, we can now instantiate the DeepONet model using the `DeepONet` class from `pina.model`.
# In[6]:
# initialize truck and branch net
trunk = TrunkNet(
layers=[256] * 4,
output_dimensions=Nx,
input_dimensions=1, # time variable dimension
func=torch.nn.ReLU,
)
branch = BranchNet(
layers=[256] * 4,
output_dimensions=Nx,
input_dimensions=Nx, # spatial variable dimension
func=torch.nn.ReLU,
)
# initialize the DeepONet model
model = DeepONet(
branch_net=branch,
trunk_net=trunk,
input_indeces_branch_net=["u0"],
input_indeces_trunk_net=["u0"],
reduction="id",
aggregator="*",
)
# The aggregation and reduction functions combine the outputs of the branch and trunk networks. In this example, their outputs are multiplied element-wise, and no reduction is applied — meaning the final output has the same dimensionality as each network’s output.
#
# We train the model using a `SupervisedSolver` with an `MSE` loss. Below, we first define the solver and then the trainer used to run the optimization.
# In[ ]:
# define solver
solver = SupervisedSolver(problem=problem, model=model)
# define the trainer and train
trainer = Trainer(
solver=solver, max_epochs=200, enable_model_summary=False, accelerator="cpu"
)
trainer.train()
# Let's see the final train and test errors:
# In[8]:
# the l2 error
l2 = LpLoss()
with torch.no_grad():
train_err = l2(trainer.solver(data_0_training), data_dt_training)
test_err = l2(trainer.solver(data_0_testing), data_dt_testing)
print(f"Training error: {float(train_err.mean()):.2%}")
print(f"Testing error: {float(test_err.mean()):.2%}")
# We can see that the testing error is slightly higher than the training one, maybe due to overfitting. We now plot some results trajectories.
# In[9]:
for i in [1, 2, 3]:
plt.subplot(3, 1, i)
plt.plot(
torch.linspace(0, 2, Nx),
solver(data_0_training)[10 * i].detach().flatten(),
label=r"$u_{NN}$",
)
plt.plot(
torch.linspace(0, 2, Nx),
data_dt_training[10 * i].extract("u").flatten(),
label=r"$u$",
)
plt.xlabel(r"$x$")
plt.legend(loc="upper right")
plt.show()
# As we can see, they are barely indistinguishable. To better understand the difference, we now plot the residuals, i.e. the difference of the exact solution and the predicted one.
# In[10]:
for i in [1, 2, 3]:
plt.subplot(3, 1, i)
plt.plot(
torch.linspace(0, 2, Nx),
data_dt_training[10 * i].extract("u").flatten()
- solver(data_0_training)[10 * i].detach().flatten(),
label=r"$u - u_{NN}$",
)
plt.xlabel(r"$x$")
plt.tight_layout()
plt.legend(loc="upper right")
# ## What's Next?
#
# We have seen a simple example of using `DeepONet` to learn the advection operator. This only scratches the surface of what neural operators can do. Here are some suggested directions to continue your exploration:
#
# 1. **Train on more complex PDEs**: Extend beyond the advection equation to more challenging operators, such as diffusion or nonlinear conservation laws.
#
# 2. **Increase training scope**: Experiment with larger datasets, deeper networks, and longer training schedules to unlock the full potential of neural operator learning.
#
# 3. **Generalize to the full advection operator**: Train the model to learn the general operator $\mathcal{G}_t: u_0(x) \mapsto u(x,t) = u_0(x - t)$ so the network predicts solutions for arbitrary times, not just a single fixed horizon.
#
# 4. **Investigate architectural variations**: Compare different operator learning architectures (e.g., Fourier Neural Operators, Physics-Informed DeepONets) to see how they perform on similar problems.
#
# 5. **...and much more!**: From adding noise robustness to testing on real scientific datasets, the space of possibilities is wide open.
#
# For more resources and tutorials, check out the [PINA Documentation](https://mathlab.github.io/PINA/).