# Initial version of ODD monitoring code and examples (#60)
# MODD

The MODD class receives a set of labeled traces and outputs an ODD Monitor. It implements the boxes on the right side of the following diagram:

*(MODD architecture diagram)*

## VerifAI Interface

Given a specification $\varphi$, VerifAI uses the sampler, analyzer, and simulator to generate a set of traces $\{\sigma_i, \ell_i\}_i$.

The MODD class receives the set of evaluated simulation traces $\{\sigma_i\}_i$, where each trace $\sigma_i$ is defined by a feature vector plus one special feature, namely the correctness of the specification, and generates a training dataset $\{\tau_i, \ell_i\}_i$, where $\tau_i$ is a feature vector and $\ell_i$ is a single label (Data Generation box in the diagram).

The MODD class then uses the training dataset $\{\tau_i, \ell_i\}_i$ to train a monitor $M$ (Learner box).

The MODD class evaluates the monitor $M$ on new simulations (Evaluation box). If the optimality objective is not met, the MODD triggers the generation of new simulations to expand the training dataset and restarts the training process.
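The generate/train/evaluate refinement loop described above can be sketched in miniature. Everything below is a simplified, hypothetical illustration: traces are single feature values, the "specification" is a hidden threshold, and the "monitor" is a one-parameter classifier. None of these names come from the actual MODD class.

```python
# Hypothetical sketch of the MODD refinement loop (Data Generation,
# Learner, Evaluation); not the real MODD API.
import random

def generate_traces(n, rng):
    """Data Generation: sample a feature and label it against a hidden spec."""
    taus = [rng.uniform(0.0, 1.0) for _ in range(n)]
    labels = [1 if t < 0.7 else 0 for t in taus]  # spec holds below 0.7
    return list(zip(taus, labels))

def train_monitor(dataset):
    """Learner: pick the largest feature value observed to be safe."""
    safe = [t for t, l in dataset if l == 1]
    return max(safe) if safe else 0.0

def accuracy(monitor, dataset):
    """Evaluation: fraction of samples the monitor classifies correctly."""
    correct = sum(1 for t, l in dataset if (1 if t <= monitor else 0) == l)
    return correct / len(dataset)

rng = random.Random(0)
dataset = generate_traces(50, rng)
for _ in range(5):  # refinement loop
    monitor = train_monitor(dataset)
    if accuracy(monitor, generate_traces(50, rng)) >= 0.95:
        break  # optimality objective met
    dataset += generate_traces(50, rng)  # expand training data and retrain
```

If the evaluation accuracy falls short, the loop expands the dataset and retrains, mirroring the Evaluation box triggering new simulations.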
### Implementation details

The MODD receives the following inputs:
- `datagen_params`: Parameters required to generate the training dataset:
  - preprocessing function
  - labeling function
  - saving directory
- `trainer_params`: Parameters required to train the ODD Monitor:
  - model to be trained (sklearn, pytorch, etc.)
  - training results saving directory
  - trained model saving path
- `eval_params`: Parameters required to evaluate the ODD Monitor:
  - evaluation method
  - specification
  - number of simulations to run
  - number of steps per simulation
  - evaluation results saving directory
  - evaluation results of running the monitor on simulations
  - evaluation results of running the system without the monitor on the same simulations
  - scenes saving path
- `sampling_params`: Parameters required to specify how to make calls to a sampler to generate more data:
  - sampler
  - server_class
  - server_options
  - path to the controller to be monitored
- `global_params`: Parameters specifying how many simulations to run per loop of the MODD generation process:
  - initial number of simulations
  - initial number of simulation steps
  - number of simulations per refinement loop
  - number of steps per refinement simulation
  - iterations of the refinement loop
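The input groups listed above might be assembled as keyword dictionaries along these lines. Every key name, function, and path below is an illustrative assumption, not the actual MODD interface:

```python
# Illustrative parameter groups mirroring the MODD inputs listed above.
# All names and values here are hypothetical examples.

def preprocess(trace):   # placeholder preprocessing function
    return trace

def label(trace):        # placeholder labeling function: spec satisfied iff rho >= 0
    return int(trace.get("rho", 0) >= 0)

datagen_params = {
    "preprocess_fn": preprocess,
    "label_fn": label,
    "save_dir": "./modd_data/",
}
trainer_params = {
    "model": "sklearn-or-pytorch-model",       # model to be trained
    "results_dir": "./modd_training/",
    "model_path": "./modd_training/monitor.pth",
}
eval_params = {
    "method": "simulation",
    "specification": "phi",
    "num_sim": 10,
    "num_steps": 100,
    "results_dir": "./modd_eval/",
    "scenes_path": "./modd_eval/scenes/",
}
sampling_params = {
    "sampler": "halton",
    "server_class": "ServerClass",
    "server_options": {},
    "controller_path": "./controller.pth",
}
global_params = {
    "init_num_sim": 20,
    "init_num_steps": 100,
    "sim_per_loop": 10,
    "steps_per_loop_sim": 100,
    "refinement_iterations": 5,
}
```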
## Running instructions

To get an MODD monitor, we assume access to an already trained controller. For our example, we trained the controller `examples/modd/carla/controller_cte_dist_130.pth`. Instructions for training other controllers are included at the end of this section.

### Setup instructions for our example
- Create a virtual environment with python=3.9.
- Clone the repository and install VerifAI as usual: `python -m pip install -e .`
- Change directory to `examples/modd/`.
- Run the example: `python ./modd_learner_main.py`.

### Instructions for training controllers
- Run `python data_generation.py [scenic_path] --data_dir [data_dir] --num_sim [num_sim] --num_steps [num_steps]` with the following parameters:
  - `scenic_path`: Path to the Scenic file used to generate data. For our example, we used `followLeader_datagen.scenic`.
  - `data_dir`: Path to the folder where the training data will be saved.
  - `num_sim`: Number of simulations.
  - `num_steps`: Number of timesteps per simulation.
- Run `python controller_training.py --data_dir [data_dir] --model_dir [model_dir] --model_name [model_name]` with the following parameters:
  - `data_dir`: Path to the folder where the training data was saved.
  - `model_dir`: Path to the folder where the trained model will be saved.
  - `model_name`: Name of the saved model file.
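The two-step workflow above can be chained from a small driver script. The flag names match the argparse interfaces of the two scripts; the concrete argument values (100 simulations, 300 steps) are arbitrary examples:

```python
# Hypothetical driver chaining the two training steps described above.
import subprocess  # used by the commented-out run step below
import sys

def datagen_cmd(scenic_path, data_dir, num_sim, num_steps):
    """Build the data_generation.py invocation."""
    return [sys.executable, "data_generation.py", scenic_path,
            "--data_dir", data_dir,
            "--num_sim", str(num_sim),
            "--num_steps", str(num_steps)]

def training_cmd(data_dir, model_dir, model_name):
    """Build the controller_training.py invocation."""
    return [sys.executable, "controller_training.py",
            "--data_dir", data_dir,
            "--model_dir", model_dir,
            "--model_name", model_name]

if __name__ == "__main__":
    data_dir = "./training_data/"
    steps = [
        datagen_cmd("followLeader_datagen.scenic", data_dir, 100, 300),
        training_cmd(data_dir, "./models", "controller_cte_dist"),
    ]
    for cmd in steps:
        print(" ".join(cmd))                 # inspect the command first...
        # subprocess.run(cmd, check=True)    # ...then run when the scripts are present
```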
## controller_training.py

```python
import argparse
import os
import re

import cv2
import numpy as np
import torch
import torchvision
from torch.utils.data import Dataset

print("CUDA enabled:", torch.cuda.is_available())


class ResNet(torch.nn.Module):
    """ResNet backbone from torchvision."""

    def __init__(self, layers="18", pre_trained=False):
        super().__init__()
        if layers == "18":
            self.model = torchvision.models.resnet18(pretrained=pre_trained)
        else:
            raise NotImplementedError

    def forward(self, x):
        return self.model(x)


class CNN(torch.nn.Module):
    """ResNet backbone followed by a two-layer regression head predicting (cte, dist)."""

    def __init__(self, resnet=False, pretrained=False):
        super().__init__()
        if resnet:
            self.model = ResNet(pre_trained=pretrained)
        else:
            raise NotImplementedError
        self.fc1 = torch.nn.Linear(1000, 1024)
        self.head = torch.nn.Linear(1024, 2)

    def forward(self, x):
        x = self.model(x)
        x = self.fc1(x)
        return self.head(x)


def train_cnn(
    train_dataset,
    val_dataset,
    save_path,
    batch_size=128,
    n_epochs=351,
    lr=0.0005,
    device="cuda",
    resnet=True,
    model_name="model",
):
    """Train a CNN on the given data, checkpointing whenever validation loss improves."""
    model = CNN(resnet=resnet).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=1e-5)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, factor=0.9, patience=10
    )
    criterion = torch.nn.MSELoss()
    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=batch_size, shuffle=True, drop_last=True
    )
    test_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, drop_last=True)
    best_val_loss = np.inf

    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Number of parameters: {n_params}")

    # Baseline validation loss before any training
    model.eval()
    with torch.no_grad():
        test_loss = 0
        for x_batch, y_batch in test_loader:
            y_batch = y_batch.reshape((batch_size, 2)).to(device)
            y_pred = model(x_batch.to(device))
            test_loss += criterion(y_pred, y_batch).item()
        test_loss /= len(test_loader)
    print(f"Epoch {-1}: val loss = {test_loss}")

    for epoch in range(n_epochs):
        model.train()
        total_loss = 0
        for x_batch, y_batch in train_loader:
            optimizer.zero_grad()
            y_batch = y_batch.reshape((batch_size, 2)).to(device)
            y_pred = model(x_batch.to(device))
            loss = criterion(y_pred, y_batch)
            total_loss += loss.item()
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            test_loss = 0
            for x_batch, y_batch in test_loader:
                y_batch = y_batch.reshape((batch_size, 2)).to(device)
                y_pred = model(x_batch.to(device))
                test_loss += criterion(y_pred, y_batch).item()
            test_loss /= len(test_loader)
        scheduler.step(test_loss)
        print(
            f"Epoch {epoch}: train loss = {total_loss / len(train_loader)} val loss = {test_loss}"
        )
        # Save the model on validation improvement and at periodic checkpoints
        if test_loss < best_val_loss:
            best_val_loss = test_loss
            torch.save(model.state_dict(), os.path.join(save_path, f"{model_name}_{epoch}.pth"))
        if (epoch % 50 == 0 or epoch == n_epochs - 1) and epoch > 50:
            torch.save(model.state_dict(), os.path.join(save_path, f"{model_name}_{epoch}.pth"))
    return model


class CustomDataset(Dataset):
    """Dataset of in-memory images (HWC, BGR, uint8) with per-frame labels."""

    def __init__(self, imgs, labels, transform=None):
        super().__init__()
        self.imgs = imgs
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return min(len(self.imgs), len(self.labels))

    def __getitem__(self, index):
        # Scale to [0, 1], move channels first, and convert OpenCV's BGR to RGB
        img = (torch.from_numpy(self.imgs[index]) / 255).permute(2, 0, 1)[[2, 1, 0], :, :]
        label = self.labels[index]
        if self.transform:
            return self.transform(img), torch.FloatTensor([label])
        return img, torch.FloatTensor([label])


def sort_paths(folder_path):
    """Return image paths sorted by the number embedded in each filename."""
    names = [p for p in os.listdir(folder_path) if re.sub("[^0-9]", "", p) != ""]
    names.sort(key=lambda p: int(re.sub("[^0-9]", "", p)))
    return [os.path.join(folder_path, p) for p in names]


def load_data_np(data_dir, split_ratio=0.1, seed=0):
    """Load images and labels from each simulation folder and split into train/validation."""
    datasets = []
    sim_dirs = [os.path.join(data_dir, e) for e in os.listdir(data_dir)]
    for sim in sim_dirs:
        if len(os.listdir(sim)) == 0:
            print(f"Folder {sim} should be removed. No images in the folder")
            continue
        loaded_imgs = [cv2.imread(img) for img in sort_paths(sim)]
        imgs = np.array(loaded_imgs)[:, 80:160, 40:600]  # crop to the region of interest
        dist = np.expand_dims(np.load(os.path.join(sim, "dist.npz"))["values"], axis=0) / 30  # normalize
        cte = np.expand_dims(np.load(os.path.join(sim, "cte.npz"))["values"], axis=0)
        labels = np.transpose(np.concatenate((cte, dist)))
        datasets.append(CustomDataset(imgs, labels))
    ds = torch.utils.data.ConcatDataset(datasets)
    print(f"Total of {len(ds)} images")
    generator = torch.Generator().manual_seed(seed)
    return torch.utils.data.random_split(ds, [1 - split_ratio, split_ratio], generator=generator)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Train CNN")
    parser.add_argument("--data_dir", type=str, default="./training_data/")
    parser.add_argument("--model_dir", type=str, default="./models")
    parser.add_argument("--model_name", type=str, default="controller_cte_dist")
    args = parser.parse_args()

    os.makedirs(args.model_dir, exist_ok=True)
    train_dataset, val_dataset = load_data_np(args.data_dir)
    train_cnn(
        train_dataset=train_dataset,
        val_dataset=val_dataset,
        save_path=args.model_dir,
        resnet=True,
        model_name=args.model_name,
    )
```
## data_generation.py

```python
import argparse
import os
import shutil
import subprocess
import time

# Arg parser
parser = argparse.ArgumentParser(description="modd", formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument("scenic_path", help="path to scenic file", metavar="scenic_path")
parser.add_argument("--data_dir", type=str, default="./training_data/")
parser.add_argument("--num_sim", help="number of simulations used per monitor", type=int, default=1)
parser.add_argument("--num_steps", help="number of steps per simulation", type=int, default=60)
args = parser.parse_args()

scenic_path = args.scenic_path
num_of_simulations = args.num_sim
num_of_steps = args.num_steps
data_dir = args.data_dir

last_folder = 0  # starting index; raise to resume an interrupted run

for i in range(last_folder, num_of_simulations):
    # Initialize logger folder structure
    if i == 0:
        if os.path.exists(data_dir):
            shutil.rmtree(data_dir)
        os.mkdir(data_dir)

    sim_dir = f"{data_dir}{i}"
    if not os.path.exists(sim_dir):
        os.mkdir(sim_dir)

    print(f"Executing sim {i}")
    cmd = [
        "scenic", "-S", scenic_path, "--count", "1", "--time", str(num_of_steps),
        "--2d", "--param", "recordFolder", sim_dir, "--param", "timeout", "30",
    ]
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError:
        # Retry once after a short pause in case of a transient simulator failure
        print(f"Simulation {i} failed.")
        time.sleep(5)
        subprocess.run(cmd, check=True)
```