Dynamic routing between capsules

Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton (2017)

Key points

  • CNN: good at detecting features + dealing with translation
    • Less good at capturing spatial relationships between features (relative size, perspective, orientation) and at handling other affine transformations
    • May be fooled by a "Picasso face" (all the parts of a face are present, but in the wrong arrangement)
  • Solution: capsules, which represent features by vectors that encode e.g. orientation and size in addition to likelihood
    • Activity vector: encodes the instantiation parameters (pose, velocity, etc.)
      • Length: probability that the object exists (squashed to stay below 1)
      • Orientation: represents the instantiation parameters
    • Better generalization: no separate neurons needed for differently oriented objects (as in CNN) --> the number of units doesn't grow exponentially with the number of pose dimensions!
    • Also: max pooling throws away a lot of information, while capsules keep a weighted sum of the previous layer's outputs --> better at dealing with overlap
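  • Dynamic routing-by-agreement: top-down feedback on whether a lower-level capsule's output is useful to a higher-level capsule, based on how well the lower capsule's prediction agrees (scalar product) with the higher capsule's output
    • Backprop still used to train the weights, with the iterative routing running on top of it --> slow!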
  • Capsules are good for dealing with segmentation, due to routing-by-agreement
  • Capsules are equivariant to viewpoint changes, instead of trying to discard viewpoint information (invariance) --> can deal with multiple different transformations at the same time
  • However: a capsule can only represent 1 instance of its entity type at a given location in the image (crowding)
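
The squashing of the activity vector and the routing-by-agreement loop from the notes above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the names `squash`, `softmax`, `dot`, `route`, `u_hat` and `num_iters` are chosen here for readability, the learned transformation matrices that would produce the prediction vectors are left out (the predictions are simply passed in), and nothing is trained.

```python
import math

def squash(s, eps=1e-9):
    """Keep the direction of s, but map its length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    # |s|^2 / (1 + |s|^2) scales the unit vector s / |s|; eps avoids dividing by zero.
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in s]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route(u_hat, num_iters=3):
    """Routing-by-agreement sketch.

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j (assumed given; normally computed by learned weights).
    Returns (v, c): output capsule vectors and the coupling coefficients
    used in the last iteration.
    """
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits, start uniform
    for _ in range(num_iters):
        # Each input capsule distributes its output over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            # Weighted sum of the predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement: raise the logit where a prediction aligns with the output.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += dot(u_hat[i][j], v[j])
    return v, c

# Toy input: 4 input capsules, 2 output capsules, 3-dimensional vectors.
# All inputs agree on output 0; their predictions for output 1 cancel out.
agree = [1.0, 0.0, 0.0]
u_hat = [[agree, [0.0, sign, 0.0]] for sign in (1.0, -1.0, 1.0, -1.0)]
v, c = route(u_hat)
print(c[0])  # coupling of input capsule 0; the first entry ends up above 0.5
```

In the toy input, the predictions for output capsule 0 all point the same way, so the coupling coefficients shift towards it over the iterations, while the conflicting predictions for output capsule 1 sum to a zero vector. The lengths of the vectors in `v` always stay below 1, which is what lets them be read as probabilities.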