PINA/tutorials/tutorial24/tutorial.py at master · mathLab/PINA

History

304 lines (238 loc) · 11 KB

Raw

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

#!/usr/bin/env python

# coding: utf-8

# # Tutorial: Advection Equation with data driven DeepONet

# [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mathLab/PINA/blob/master/tutorials/tutorial24/tutorial.ipynb)

# > ##### ⚠️ ***Before starting:***

# > We assume you are already familiar with the concepts covered in the [Getting started with PINA](https://mathlab.github.io/PINA/_tutorial.html#getting-started-with-pina) tutorials. If not, we strongly recommend reviewing them before exploring this advanced topic.

# In this tutorial, we demonstrate how to solve the advection operator learning problem using `DeepONet`. We follow the original formulation of Lu *et al.* in [*DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operator*](https://arxiv.org/abs/1910.03193).

# We begin by importing the necessary modules.

# In[1]:

## routine needed to run the notebook on Google Colab

try:

import google.colab

IN_COLAB = True

except:

IN_COLAB = False

if IN_COLAB:

get_ipython().system('pip install "pina-mathlab[tutorial]"')

# get the data

get_ipython().system('mkdir "data"')

get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_input_testing.pt" -O "data/advection_input_testing.pt"')

get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_input_training.pt" -O "data/advection_input_training.pt"')

get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_output_testing.pt" -O "data/advection_output_testing.pt"')

get_ipython().system('wget "https://github.com/mathLab/PINA/raw/refs/heads/master/tutorials/tutorial24/data/advection_output_training.pt" -O "data/advection_output_training.pt"')

import matplotlib.pyplot as plt

import torch

import warnings

from pina import Trainer, LabelTensor

from pina.model import FeedForward, DeepONet

from pina.solver import SupervisedSolver

from pina.problem.zoo import SupervisedProblem

from pina.loss import LpLoss

warnings.filterwarnings("ignore")

# ## Advection problem and data preparation

# We consider the 1D advection equation

# $$

# \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} = 0,

# \quad x \in [0,2], \; t \in [0,1],

# $$

# with periodic boundary conditions. The initial condition is chosen as a Gaussian pulse centered at a random location

# $\mu \sim U(0.05, 1)$ and with variance $\sigma^2 = 0.02$:

# $$

# u_0(x) = \frac{1}{\sqrt{\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}},

# \quad x \in [0,2].

# $$

# Our goal is to learn the operator

# $$

# \mathcal{G}: u_0(x) \mapsto u(x, t = \delta) = u_0(x - \delta),

# $$

# with $\delta = 0.5$ for this tutorial. In practice, this means learning a mapping from the initial condition to the solution at a fixed later time.

# The dataset therefore consists of trajectories where inputs are initial profiles and outputs are the same profiles shifted by $\delta$.

# The data has shape `[T, Nx, D]`, where:

# - `T` — number of trajectories (100 for training, 1000 for testing),

# - `Nx` — number of spatial grid points (fixed at 100),

# - `D = 1` — single scalar field value `u`.

# We now load the dataset and visualize sample trajectories.

# In[2]:

# loading training data

data_0_training = LabelTensor(

torch.load("data/advection_input_training.pt", weights_only=False),

labels="u0",

)

data_dt_training = LabelTensor(

torch.load("data/advection_output_training.pt", weights_only=False),

labels="u",

)

# loading testing data

data_0_testing = LabelTensor(

torch.load("data/advection_input_testing.pt", weights_only=False),

labels="u0",

)

data_dt_testing = LabelTensor(

torch.load("data/advection_output_testing.pt", weights_only=False),

labels="u",

)

# The data are loaded, let's visualize a few of the initial conditions!

# In[3]:

# storing the discretization in space:

Nx = data_0_training.shape[1]

for idx, i in enumerate(torch.randint(0, data_0_training.shape[0] - 1, (3,))):

u0 = data_0_training[int(i)].extract("u0")

u = data_dt_training[int(i)].extract("u")

x = torch.linspace(

0, 2, Nx

) # the discretization in the spatial dimension is fixed

plt.subplot(3, 1, idx + 1)

plt.plot(x, u0.flatten(), label=rf"$u_0(x)$")

plt.plot(x, u.flatten(), label=rf"$u(x, t=\delta)$")

plt.xlabel(rf"$x$")

plt.tight_layout()

plt.legend()

# Great — we have generated a traveling wave and visualized a few samples. Next, we will use this data to train a `DeepONet`.

# ## DeepONet

# The standard `DeepONet` architecture consists of two subnetworks: a **branch** network and a **trunk** network (see figure below).

# <center>

# <img src="http://raw.githubusercontent.com/mathLab/PINA/master/tutorials/static/deeponet.png" alt="image from: Moya, C.; Lin, G. Fed-DeepONet: Stochastic Gradient-Based Federated Training of Deep Operator Networks. Algorithms 2022, 15, 325. https://doi.org/10.3390/a15090325" style="width:50%; height:50%;"/>

# </center>

# <center>

# <i>Image source: Moya & Lin (2022)</i>

# </center>

# In our setting:

# - The **branch network** receives the initial condition of each trajectory, with input shape `[B, Nx]` — where `B` is the batch size and `Nx` the spatial discretization points of the field at $ t = 0 $.

# - The **trunk network** takes input of shape `[B, 1]`, corresponding to the location at which we evaluate the solution (in this 1D case, the spatial coordinate).

# Together, these networks learn the mapping from the initial field to the solution at a later time.

# We now define and train the model for the advection problem.

# In[4]:

problem = SupervisedProblem(

input_=data_0_training,

output_=data_dt_training,

input_variables=data_0_training.labels,

output_variables=data_dt_training.labels,

)

# We now proceede to create the trunk and branch networks.

# In[5]:

# create Trunk model

class TrunkNet(torch.nn.Module):

def __init__(self, **kwargs):

super().__init__()

self.trunk = FeedForward(**kwargs)

def forward(self, x):

t = (

torch.zeros(size=(x.shape[0], 1), requires_grad=False) + 0.5

) # create an input of only 0.5

return self.trunk(t)

# create Branch model

class BranchNet(torch.nn.Module):

def __init__(self, **kwargs):

super().__init__()

self.branch = FeedForward(**kwargs)

def forward(self, x):

return self.branch(x.flatten(1))

# The `TrunkNet` is implemented as a standard `FeedForward` network with a slightly modified `forward` method. In this case, the trunk network simply outputs a tensor filled with the value $0.5$, repeated for each trajectory — corresponding to evaluating the solution at time $t = 0.5$.

# The `BranchNet` is also a `FeedForward` network, but its `forward` pass first flattens the input along the last dimension. This produces a vector of length `Nx`, representing the sampled initial condition at the sensor locations.

# With both subnetworks defined, we can now instantiate the DeepONet model using the `DeepONet` class from `pina.model`.

# In[6]:

# initialize truck and branch net

trunk = TrunkNet(

layers=[256] * 4,

output_dimensions=Nx,

input_dimensions=1, # time variable dimension

func=torch.nn.ReLU,

)

branch = BranchNet(

layers=[256] * 4,

output_dimensions=Nx,

input_dimensions=Nx, # spatial variable dimension

func=torch.nn.ReLU,

)

# initialize the DeepONet model

model = DeepONet(

branch_net=branch,

trunk_net=trunk,

input_indeces_branch_net=["u0"],

input_indeces_trunk_net=["u0"],

reduction="id",

aggregator="*",

)

# The aggregation and reduction functions combine the outputs of the branch and trunk networks. In this example, their outputs are multiplied element-wise, and no reduction is applied — meaning the final output has the same dimensionality as each network’s output.

# We train the model using a `SupervisedSolver` with an `MSE` loss. Below, we first define the solver and then the trainer used to run the optimization.

# In[ ]:

# define solver

solver = SupervisedSolver(problem=problem, model=model)

# define the trainer and train

trainer = Trainer(

solver=solver, max_epochs=200, enable_model_summary=False, accelerator="cpu"

)

trainer.train()

# Let's see the final train and test errors:

# In[8]:

# the l2 error

l2 = LpLoss()

with torch.no_grad():

train_err = l2(trainer.solver(data_0_training), data_dt_training)

test_err = l2(trainer.solver(data_0_testing), data_dt_testing)

print(f"Training error: {float(train_err.mean()):.2%}")

print(f"Testing error: {float(test_err.mean()):.2%}")

# We can see that the testing error is slightly higher than the training one, maybe due to overfitting. We now plot some results trajectories.

# In[9]:

for i in [1, 2, 3]:

plt.subplot(3, 1, i)

plt.plot(

torch.linspace(0, 2, Nx),

solver(data_0_training)[10 * i].detach().flatten(),

label=r"$u_{NN}$",

)

plt.plot(

torch.linspace(0, 2, Nx),

data_dt_training[10 * i].extract("u").flatten(),

label=r"$u$",

)

plt.xlabel(r"$x$")

plt.legend(loc="upper right")

plt.show()

# As we can see, they are barely indistinguishable. To better understand the difference, we now plot the residuals, i.e. the difference of the exact solution and the predicted one.

# In[10]:

for i in [1, 2, 3]:

plt.subplot(3, 1, i)

plt.plot(

torch.linspace(0, 2, Nx),

data_dt_training[10 * i].extract("u").flatten()

- solver(data_0_training)[10 * i].detach().flatten(),

label=r"$u - u_{NN}$",

)

plt.xlabel(r"$x$")

plt.tight_layout()

plt.legend(loc="upper right")

# ## What's Next?

# We have seen a simple example of using `DeepONet` to learn the advection operator. This only scratches the surface of what neural operators can do. Here are some suggested directions to continue your exploration:

# 1. **Train on more complex PDEs**: Extend beyond the advection equation to more challenging operators, such as diffusion or nonlinear conservation laws.

# 2. **Increase training scope**: Experiment with larger datasets, deeper networks, and longer training schedules to unlock the full potential of neural operator learning.

# 3. **Generalize to the full advection operator**: Train the model to learn the general operator $\mathcal{G}_t: u_0(x) \mapsto u(x,t) = u_0(x - t)$ so the network predicts solutions for arbitrary times, not just a single fixed horizon.

# 4. **Investigate architectural variations**: Compare different operator learning architectures (e.g., Fourier Neural Operators, Physics-Informed DeepONets) to see how they perform on similar problems.

# 5. **...and much more!**: From adding noise robustness to testing on real scientific datasets, the space of possibilities is wide open.

# For more resources and tutorials, check out the [PINA Documentation](https://mathlab.github.io/PINA/).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FilesExpand file tree

tutorial.py

Latest commit

History

tutorial.py

File metadata and controls