cryoDRGN

Deep Reconstructing Generative Networks for cryo-EM heterogeneous reconstruction

For general information on job submission, please see here.

General information

  • cryoDRGN runs on extracted particle stacks that have undergone a 3D refinement. This ‘consensus refinement’ information is then used as input for cryoDRGN.
  • We have combined Steps 1 to 6, as outlined in the cryoDRGN repo, into a single submission step. This includes:
    1. Preprocess image stack
    2. Parse image poses
    3. Parse CTF parameters
    4. [We skip Step 4]
    5. Run cryoDRGN heterogeneous reconstruction
    6. Analysis of results
      • We perform a default analysis at this point using the last epoch as the input.
  • Recommended usage from cryoDRGN repo:
    • “It is recommended to first train on lower-resolution images (e.g. D=128) with --zdim 1 and with --zdim 10 using the default architecture (fast). After validation, pose optimization, and any necessary particle filtering, then train on the full resolution image stack (up to D=256) with a large architecture (slow).”
    • “Note: While these settings worked well for the datasets we’ve tested, they are highly experimental for the general case as different datasets have diverse sources of heterogeneity. Please reach out to the authors with questions/consult — we’d love to learn more.”
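
As a rough illustration of what the combined submission step covers, the sketch below assembles one command line per step. The subcommands and flags follow the cryoDRGN README as we understand it and may differ between cryoDRGN versions; the helper name build_cryodrgn_commands and the file names (pose.pkl, ctf.pkl, particles.128.mrcs) are hypothetical. This is not necessarily the exact script that COSMIC² runs, and nothing here is executed.

```python
import shlex


def build_cryodrgn_commands(star, stack, box, scaled_box, apix,
                            zdim, epochs, last_epoch, outdir):
    """Return one argument list per combined step (hypothetical helper).

    Nothing is executed; the lists are only assembled for inspection.
    """
    # Hypothetical name for the downsampled particle stack.
    scaled_stack = f"particles.{scaled_box}.mrcs"
    return [
        # Step 1: preprocess (downsample) the image stack
        ["cryodrgn", "downsample", stack, "-D", str(scaled_box),
         "-o", scaled_stack],
        # Step 2: parse image poses from the consensus refinement
        ["cryodrgn", "parse_pose_star", star, "-D", str(box),
         "-o", "pose.pkl"],
        # Step 3: parse CTF parameters
        ["cryodrgn", "parse_ctf_star", star, "-D", str(box),
         "--Apix", str(apix), "-o", "ctf.pkl"],
        # Step 5: heterogeneous reconstruction (Step 4 is skipped)
        ["cryodrgn", "train_vae", scaled_stack, "--poses", "pose.pkl",
         "--ctf", "ctf.pkl", "--zdim", str(zdim), "-n", str(epochs),
         "-o", outdir],
        # Step 6: default analysis on the last epoch
        ["cryodrgn", "analyze", outdir, str(last_epoch)],
    ]


if __name__ == "__main__":
    commands = build_cryodrgn_commands(
        star="run_data.star", stack="particles.mrcs",
        box=256, scaled_box=128, apix=1.0,
        zdim=1, epochs=25,
        last_epoch=24,  # assumes 0-based epoch numbering; check your version
        outdir="vae128_z1",
    )
    for command in commands:
        print(shlex.join(command))
```

Following the recommended usage quoted above, a first pass would use zdim=1 (fast) and a second pass zdim=10 on the same downsampled stack.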

Input particle stack format:

RELION-extracted particle stacks. Click here to learn what this means.

*We only support RELION stacks at this moment*

Required input parameters:

  1. Consensus refinement STAR file
    • STAR file from RELION refinement, uploaded using ‘Browser upload’, not Globus
  2. Box size for refined structure
    • Box size of structure determined from refinement
  3. Scaled-down box size
    • cryoDRGN will scale down the box size to save memory (and time!). Default is 128 pixels, but you can also use 64 pixels.
  4. Pixel size of data
    • Provide pixel size of input data
  5. Dimension of latent variable (--zdim) (fast=1; slow=10)
    • More dimensions = longer training per epoch
  6. Number of epochs to use during training (-n)
    • Number of epochs (full passes over the particle stack) used to train the variational autoencoder (VAE)
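
The box size, scaled-down box size, and pixel size are linked: assuming the downsampling keeps the same field of view, the pixel size grows by the ratio of the two box sizes, and the best resolution the scaled-down data can carry is the Nyquist limit of twice the new pixel size. A minimal sketch of that arithmetic (the function names are illustrative and not part of cryoDRGN):

```python
def scaled_pixel_size(apix, box, scaled_box):
    """Pixel size (Å/pixel) after scaling a box of `box` pixels to `scaled_box`."""
    if scaled_box > box:
        raise ValueError("scaled-down box must not exceed the original box")
    return apix * box / scaled_box


def nyquist_resolution(apix):
    """Best resolution (Å) representable at a given pixel size."""
    return 2.0 * apix


if __name__ == "__main__":
    # Example values only: a 256-pixel box at 1.0 Å/pixel.
    for scaled_box in (128, 64):
        new_apix = scaled_pixel_size(1.0, 256, scaled_box)
        print(scaled_box, new_apix, nyquist_resolution(new_apix))
```

For a 256-pixel box at 1.0 Å/pixel, scaling to 128 pixels gives 2.0 Å/pixel (Nyquist 4.0 Å), and scaling to 64 pixels gives 4.0 Å/pixel (Nyquist 8.0 Å). The smaller box is therefore faster but can only resolve coarser heterogeneity.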

Optional or Advanced input parameters:

  1. Advanced: Number of nodes in hidden layers for encoder (--qdim)
    • The encoder is the part of the network that maps each particle image to a position in latent space.
    • Default = 256; you can make the network larger by increasing to 1024
  2. Advanced: Number of hidden layers for encoder (--qlayers)
    • Adding layers increases complexity
  3. Advanced: Number of nodes in hidden layers for decoder (--pdim)
    • The decoder ‘interprets’ the latent space: it generates the 3D density for a given latent value.
    • Default = 256; you can make the network larger by increasing to 1024
  4. Advanced: Number of hidden layers for decoder (--players)
    • Adding layers increases complexity
  5. Optional: Check box if STAR file is from RELION v. 3.1
    • Enables reading of STAR files written by RELION 3.1
  6. Optional: Invert contrast (needed if particles are white on black background)
    • Default from RELION is white on black background
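
To give a feel for how the number of nodes and the number of hidden layers affect network size, the sketch below counts weights and biases in a plain fully connected network. This is an illustrative assumption only: cryoDRGN's actual encoder and decoder architectures may differ, and the input and output sizes used in the example are hypothetical.

```python
def mlp_parameter_count(in_dim, hidden_dim, hidden_layers, out_dim):
    """Weights plus biases of a plain fully connected network.

    Illustrative only; not cryoDRGN's exact architecture.
    """
    dims = [in_dim] + [hidden_dim] * hidden_layers + [out_dim]
    # Each layer contributes (inputs x outputs) weights and one bias per output.
    return sum(a * b + b for a, b in zip(dims[:-1], dims[1:]))


if __name__ == "__main__":
    in_dim, out_dim = 128 * 128, 20  # hypothetical sizes for the example
    for hidden_dim in (256, 1024):
        for hidden_layers in (3, 6):
            count = mlp_parameter_count(in_dim, hidden_dim, hidden_layers, out_dim)
            print(hidden_dim, hidden_layers, count)
```

In such a network, each extra hidden layer adds roughly hidden_dim × hidden_dim weights, so raising the node count from 256 to 1024 makes every hidden layer about 16 times larger, while adding layers grows the network linearly. Both increase training time per epoch.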

Monitoring job progress

While the job is running, you can watch the output file ‘stdout.txt’ to follow job progress. This can be found in the “Intermediate Results” file listing for your task.

Citation:

CryoDRGN: Reconstruction of heterogeneous structures from cryo-electron micrographs using neural networks. Ellen D. Zhong, Tristan Bepler, Bonnie Berger*, Joseph H. Davis*. bioRxiv

Software:

GitHub link

Questions:

Check out the GitHub repo and reach out to Ellen Zhong for more details on how to run cryoDRGN.

If you have specific requests for COSMIC² please email us at cosmic2support@umich.edu.