dscript.commands

dscript.commands.predict

See Prediction for full usage details.

Make new predictions with a pre-trained model. One of --seqs or --embeddings is required.
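The "one of the two is required" rule can be illustrated with a small argparse sketch. The parser below is a hypothetical stand-in, not D-SCRIPT's actual argument definitions: only the two flag names come from the text above, while the program name, help strings, and the function names build_parser and parse_inputs are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; only the two flag names are taken
    # from the documentation above.
    parser = argparse.ArgumentParser(prog="predict-sketch")
    parser.add_argument("--seqs", help="path to a sequence file")
    parser.add_argument("--embeddings", help="path to precomputed embeddings")
    return parser


def parse_inputs(argv):
    """Parse argv and reject calls that supply neither input source."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.seqs is None and args.embeddings is None:
        # parser.error prints a usage message and raises SystemExit(2).
        parser.error("one of --seqs or --embeddings is required")
    return args
```

Calling parse_inputs(["--seqs", "pairs.fasta"]) returns a namespace with seqs set, while parse_inputs([]) exits with a usage error.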

dscript.commands.embed

See Embedding for full usage details.

Generate new embeddings using a pre-trained language model.

dscript.commands.train

See Training for full usage details.

Train a new model.

dscript.commands.train.interaction_eval(model, test_iterator, tensors, use_cuda)[source]

Evaluate test data set performance.

Parameters
  • model (dscript.models.interaction.ModelInteraction) – Model to be evaluated

  • test_iterator (torch.utils.data.DataLoader) – Test data iterator

  • tensors (dict[str, torch.Tensor]) – Dictionary of protein names to embeddings

  • use_cuda (bool) – Whether to use GPU

Returns

(Loss, number correct, mean square error, precision, recall, F1 Score, AUPR)

Return type

(torch.Tensor, int, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor)
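To make the returned quantities concrete, the following is a minimal pure-Python sketch of how number correct, mean squared error, precision, recall, and F1 can be derived from labels and predicted probabilities. It is not the library's implementation: the function name binary_metrics and the 0.5 decision threshold are assumptions, plain lists stand in for tensors, and the loss and AUPR terms are omitted.

```python
def binary_metrics(labels, probs, threshold=0.5):
    """Return (number correct, MSE, precision, recall, F1) for binary labels.

    Hypothetical helper: the 0.5 threshold is an assumption for illustration.
    """
    preds = [1 if p > threshold else 0 for p in probs]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    mse = sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels)

    # Guard against empty denominators when no positives are predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return correct, mse, precision, recall, f1
```

For example, labels [1, 0, 1, 0] with probabilities [0.9, 0.6, 0.4, 0.1] give one true positive, one false positive, and one false negative, so precision, recall, and F1 are all 0.5.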

dscript.commands.train.interaction_grad(model, n0, n1, y, tensors, accuracy_weight=0.35, run_tt=False, glider_weight=0, glider_map=None, glider_mat=None, use_cuda=True)[source]

Compute gradient and backpropagate loss for a batch.

Parameters
  • model (dscript.models.interaction.ModelInteraction) – Model to be trained

  • n0 (list[str]) – First protein names

  • n1 (list[str]) – Second protein names

  • y (torch.Tensor) – Interaction labels

  • tensors (dict[str, torch.Tensor]) – Dictionary of protein names to embeddings

  • accuracy_weight (float) – Weight on the accuracy objective. The representation loss is weighted by \(1 - \text{accuracy_weight}\).

  • run_tt (bool) – Use GLIDE top-down supervision

  • glider_weight (float) – Weight on the GLIDE objective loss. Accuracy loss is \((\text{GLIDER_BCE}*\text{glider_weight}) + (\text{D-SCRIPT_BCE}*(1-\text{glider_weight}))\).

  • glider_map (dict[str, int]) – Map from protein identifier to index

  • glider_mat (np.ndarray) – Matrix with pairwise GLIDE scores

  • use_cuda (bool) – Whether to use GPU

Returns

(Loss, number correct, mean square error, batch size)

Return type

(torch.Tensor, int, torch.Tensor, int)
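The two weighting parameters can be read as a nested convex combination. The sketch below works through that arithmetic with plain floats. It assumes the accuracy loss is mixed first, as in the glider_weight formula, and then combined with the representation loss using accuracy_weight. The function name combined_loss and the argument names for the individual loss terms are hypothetical, and the actual training code may assemble the terms differently.

```python
def combined_loss(dscript_bce, glider_bce, rep_loss,
                  accuracy_weight=0.35, glider_weight=0.0):
    """Combine loss terms under the weighting described in the parameter list.

    Assumed reading of the documentation, not the library's code.
    """
    # Accuracy loss: GLIDER_BCE * w + D-SCRIPT_BCE * (1 - w)
    accuracy_loss = glider_bce * glider_weight + dscript_bce * (1 - glider_weight)
    # Total: accuracy term weighted by accuracy_weight, representation
    # term weighted by (1 - accuracy_weight).
    return accuracy_weight * accuracy_loss + (1 - accuracy_weight) * rep_loss
```

With the defaults, glider_weight of 0 removes the GLIDE term entirely, so combined_loss(1.0, 0.0, 2.0) evaluates to 0.35 * 1.0 + 0.65 * 2.0.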

dscript.commands.train.predict_cmap_interaction(model, n0, n1, tensors, use_cuda)[source]

Predict whether a list of protein pairs will interact, as well as their contact map.

Parameters
  • model (dscript.models.interaction.ModelInteraction) – Trained model used for prediction

  • n0 (list[str]) – First protein names

  • n1 (list[str]) – Second protein names

  • tensors (dict[str, torch.Tensor]) – Dictionary of protein names to embeddings

  • use_cuda (bool) – Whether to use GPU

dscript.commands.train.predict_interaction(model, n0, n1, tensors, use_cuda)[source]

Predict whether a list of protein pairs will interact.

Parameters
  • model (dscript.models.interaction.ModelInteraction) – Trained model used for prediction

  • n0 (list[str]) – First protein names

  • n1 (list[str]) – Second protein names

  • tensors (dict[str, torch.Tensor]) – Dictionary of protein names to embeddings

  • use_cuda (bool) – Whether to use GPU
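The n0, n1, and tensors arguments fit together as parallel name lists plus a name-to-embedding lookup. As a rough illustration of that data layout, the sketch below pairs up embeddings by name before any model call. The helper gather_pair_embeddings is hypothetical and not part of D-SCRIPT, and plain Python lists stand in for torch.Tensor values.

```python
def gather_pair_embeddings(n0, n1, tensors):
    """Look up the embedding for each protein in each (n0[i], n1[i]) pair.

    Hypothetical helper for illustration; values in `tensors` can be any
    object (torch.Tensor in the documented API, lists in this sketch).
    """
    if len(n0) != len(n1):
        raise ValueError("n0 and n1 must have the same length")
    # Report every missing name at once rather than failing on the first.
    missing = sorted({name for name in list(n0) + list(n1) if name not in tensors})
    if missing:
        raise KeyError("no embedding for: " + ", ".join(missing))
    return [(tensors[a], tensors[b]) for a, b in zip(n0, n1)]
```

Checking names up front makes a missing embedding visible before any computation is started on the batch.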

dscript.commands.train.train_model(args, output)[source]

dscript.commands.evaluate

See Evaluation for full usage details.

Evaluate a trained model.

dscript.commands.evaluate.plot_eval_predictions(labels, predictions, path='figure')[source]

Plot histogram of positive and negative predictions, precision-recall curve, and receiver operating characteristic curve.

Parameters
  • labels (np.ndarray) – Labels

  • predictions (np.ndarray) – Predicted probabilities

  • path (str) – File prefix for plots to be saved to [default: figure]
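The histogram of positive and negative predictions rests on splitting the predicted probabilities by label and binning each group. The sketch below shows that data preparation in pure Python, without any plotting. The function names split_by_label and histogram_counts and the equal-width binning on [0, 1] are assumptions for illustration, not the library's plotting code.

```python
def split_by_label(labels, predictions):
    """Separate predicted probabilities into positive and negative groups."""
    pos = [p for y, p in zip(labels, predictions) if y == 1]
    neg = [p for y, p in zip(labels, predictions) if y == 0]
    return pos, neg


def histogram_counts(values, n_bins=10):
    """Count values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        # A value of exactly 1.0 falls into the last bin.
        idx = min(int(v * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts
```

A well-separated model concentrates the positive group in the upper bins and the negative group in the lower bins, and the two count lists can be passed to any plotting library.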