Layers

Anchor

The Anchor layer generates anchor bounding boxes at multiple scales and aspect ratios, and assigns each anchor a category that marks it as object or non-object.

Introduction

At each sliding-window location, the RCNN model simultaneously predicts multiple region proposals, where the maximum number of proposals per location is denoted \(k\). The regression convolutional layer therefore has \(4k\) outputs encoding the coordinates of \(k\) boxes, and the classification convolutional layer outputs \(2k\) scores that estimate the probability of object or not object for each proposal. The \(k\) proposals are parameterized relative to \(k\) reference bounding boxes, which we call anchors. An anchor is centered at the sliding window in question and is associated with a scale and an aspect ratio.

(scale and aspect ratio figures)

By default, three scales and three aspect ratios are used, yielding \(k = 9\) anchors at each sliding-window position. For a convolutional feature map of size \(r \times c\), there are \(rck\) anchors in total.
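
The count can be checked with a few lines of Python. This is only an illustrative sketch: the helper names are hypothetical, the scale values are the default side lengths listed under Implementation, and the 40 × 60 feature map is an assumed example size.

```python
import itertools

# Default side lengths and aspect ratios (1:2, 1:1, 2:1) as plain numbers.
scales = (128, 256, 512)
aspect_ratios = (0.5, 1.0, 2.0)


def anchors_per_location(scales, aspect_ratios):
    """k: one anchor for every (scale, aspect ratio) pair."""
    return len(list(itertools.product(scales, aspect_ratios)))


def total_anchors(rows, columns, k):
    """r * c * k anchors for an r x c feature map."""
    return rows * columns * k


k = anchors_per_location(scales, aspect_ratios)
n = total_anchors(rows=40, columns=60, k=k)
```
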

Translational symmetry

The RCNN model is translation invariant, both in terms of the anchors and in terms of the functions that compute object proposals relative to the anchors. If one translates an object in an image, the proposal should translate with it, and the same function should be able to predict the proposal in either location. This property is guaranteed because the same anchors and the same convolutional predictors are applied at every sliding-window location.

The translation-invariant property also reduces the model size. The model has a \((4 + 2) \times 9\)-dimensional convolutional output layer in the case of \(k = 9\) anchors.
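
Written out, each anchor contributes 4 regression outputs and 2 classification outputs, so for \(k = 9\):

```latex
\[
(4 + 2) \times k = 6 \times 9 = 54
\]
```

The output depth depends only on \(k\), not on the size of the feature map.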

Multi-Scale Anchors as Regression References

The model addresses multiple scales and aspect ratios by building a pyramid of anchors. The model classifies and regresses bounding boxes with reference to anchor boxes of multiple scales and aspect ratios. It only relies on images and feature maps of a single scale, and uses sliding windows on the feature map of a single size.

Because of this multiscale architecture, we can use the convolutional features computed on a single-scale image. The design of multiscale anchors is a key component for sharing features without extra cost for addressing scales.

Implementation

By default, the Anchor layer uses three scales with bounding box areas of \(128^{2}\), \(256^{2}\), and \(512^{2}\) pixels and three aspect ratios of 1:1, 1:2, and 2:1. The aspect-ratio and scale hyperparameters were not carefully tuned for a particular dataset.
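
A minimal NumPy sketch of how nine such reference boxes could be built. It is not the layer's actual implementation: the function name `base_anchors` is hypothetical, and the convention that an aspect ratio means height divided by width is an assumption made for the example.

```python
import numpy as np


def base_anchors(scales=(128, 256, 512), aspect_ratios=(0.5, 1.0, 2.0)):
    """Return a (k, 4) array of (x1, y1, x2, y2) boxes centred on the origin.

    Assumption: ratio = height / width, and every box at a given scale
    keeps the same area, scale ** 2.
    """
    boxes = []
    for scale in scales:
        for ratio in aspect_ratios:
            width = scale / np.sqrt(ratio)
            height = scale * np.sqrt(ratio)
            boxes.append([-width / 2, -height / 2, width / 2, height / 2])
    return np.array(boxes)


anchors = base_anchors()
widths = anchors[:, 2] - anchors[:, 0]
heights = anchors[:, 3] - anchors[:, 1]
areas = widths * heights
```

Changing the width and height by reciprocal factors of the square root of the ratio keeps the area fixed, so the three boxes at each scale differ only in shape.
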

The model permits predictions larger than the underlying receptive field. Such predictions are not impossible: one can still roughly infer the extent of an object when only its middle is visible.

The Anchor layer ignores anchors that cross an image’s boundaries so they do not contribute to the loss.

For a typical 1,000 × 600 image, there are roughly 20,000 (≈ 60 × 40 × 9) anchors in total. With the cross-boundary anchors ignored, about 6,000 anchors per image remain for training. If the boundary-crossing outliers are not ignored in training, they introduce large, difficult-to-correct error terms in the objective and training does not converge. However, since the fully convolutional region proposal network is applied to the entire image, cross-boundary object proposal bounding boxes might still be generated; they are clipped to the image shape by the ProposalTarget layer.
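
The counting and filtering described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, anchors are centred on each stride-16 cell, aspect ratio is taken as height divided by width, and a 960 × 640 image is used so that the grid is exactly 60 × 40.

```python
import numpy as np


def grid_anchors(rows, columns, stride=16,
                 scales=(128, 256, 512), aspect_ratios=(0.5, 1.0, 2.0)):
    """Place k reference boxes at every cell of a rows x columns grid."""
    sizes = np.array(scales, dtype=float)
    ratios = np.array(aspect_ratios)
    widths = (sizes[:, None] / np.sqrt(ratios)[None, :]).ravel()
    heights = (sizes[:, None] * np.sqrt(ratios)[None, :]).ravel()
    base = np.stack([-widths, -heights, widths, heights], axis=1) / 2  # (k, 4)

    # Centre of each grid cell in image coordinates.
    cx = (np.arange(columns) + 0.5) * stride
    cy = (np.arange(rows) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)
    centers = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (centers + base[None, :, :]).reshape(-1, 4)


def inside_image(boxes, image_width, image_height):
    """Boolean mask of boxes that lie entirely within the image."""
    return ((boxes[:, 0] >= 0) & (boxes[:, 1] >= 0) &
            (boxes[:, 2] <= image_width) & (boxes[:, 3] <= image_height))


def clip_to_image(boxes, image_width, image_height):
    """Clip box corners to the image extent instead of discarding them."""
    clipped = boxes.copy()
    clipped[:, [0, 2]] = np.clip(clipped[:, [0, 2]], 0, image_width)
    clipped[:, [1, 3]] = np.clip(clipped[:, [1, 3]], 0, image_height)
    return clipped


anchors = grid_anchors(rows=40, columns=60)
keep = inside_image(anchors, image_width=960, image_height=640)
clipped = clip_to_image(anchors, image_width=960, image_height=640)
```

Here `keep.sum()` gives the number of anchors that would remain for training, while `clipped` shows the alternative treatment applied to proposals.
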

class keras_rcnn.layers.Anchor(aspect_ratios=None, base_size=16, clobber_positives=False, negative_overlap=0.3, padding=0, positive_overlap=0.7, scales=None, stride=16, **kwargs)
Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, inputs, **kwargs) This is where the layer’s logic lives.
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
get_losses_for  
get_updates_for  

ObjectDetection

class keras_rcnn.layers.ObjectDetection(padding=300, **kwargs)
Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, x[, training]) Inputs: metadata, image information (1, 3); deltas, predicted deltas (1, N, 4*classes); proposals, output of proposal target (1, N, 4); scores, score distributions (1, N, classes).
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
detections  
get_losses_for  
get_updates_for  
pad_bounding_boxes  
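
To illustrate how predicted deltas relate to proposals, here is a minimal NumPy sketch that applies one set of (tx, ty, tw, th) deltas to each box. It assumes the standard R-CNN box parameterization (centre offsets scaled by box size, log-space width and height); whether ObjectDetection uses exactly this form is an assumption, and `apply_deltas` is a hypothetical name.

```python
import numpy as np


def apply_deltas(boxes, deltas):
    """Shift and rescale (N, 4) boxes (x1, y1, x2, y2) by (N, 4) deltas.

    Assumed parameterization: tx, ty move the centre in units of the box
    width and height; tw, th scale width and height in log space.
    """
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    center_x = boxes[:, 0] + 0.5 * widths
    center_y = boxes[:, 1] + 0.5 * heights

    new_center_x = deltas[:, 0] * widths + center_x
    new_center_y = deltas[:, 1] * heights + center_y
    new_widths = np.exp(deltas[:, 2]) * widths
    new_heights = np.exp(deltas[:, 3]) * heights

    return np.stack([
        new_center_x - 0.5 * new_widths,
        new_center_y - 0.5 * new_heights,
        new_center_x + 0.5 * new_widths,
        new_center_y + 0.5 * new_heights,
    ], axis=1)


proposals = np.array([[0.0, 0.0, 10.0, 20.0]])
unchanged = apply_deltas(proposals, np.zeros((1, 4)))
shifted = apply_deltas(proposals, np.array([[0.5, 0.0, 0.0, 0.0]]))
```

With all-zero deltas the proposal is returned unchanged; a tx of 0.5 moves the box to the right by half its width.
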

ObjectProposal

class keras_rcnn.layers.ObjectProposal(maximum_proposals=300, minimum_size=16, stride=16, **kwargs)

Propose object-containing regions from anchors

# Arguments
maximum_proposals: maximum number of regions allowed
minimum_size: minimum width/height of proposals
stride: stride size
# Input shape
(width of feature map, height of feature map, scale), (None, 4), (None)
# Output shape
(# images, # proposals, 4)
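
The roles of `maximum_proposals` and `minimum_size` can be sketched in NumPy as below. This is a simplified illustration with a hypothetical function name, not the layer's implementation; a full region-proposal stage typically also applies non-maximum suppression, which is omitted here.

```python
import numpy as np


def select_proposals(boxes, scores, maximum_proposals=300, minimum_size=16):
    """Drop boxes smaller than minimum_size, then keep the top-scoring ones."""
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    keep = np.flatnonzero((widths >= minimum_size) & (heights >= minimum_size))

    # Sort the surviving boxes by descending score and truncate.
    order = keep[np.argsort(-scores[keep])][:maximum_proposals]
    return boxes[order], scores[order]


boxes = np.array([[0.0, 0.0, 8.0, 8.0],
                  [0.0, 0.0, 32.0, 32.0],
                  [0.0, 0.0, 64.0, 64.0]])
scores = np.array([0.9, 0.2, 0.8])

kept_boxes, kept_scores = select_proposals(boxes, scores)
top_box, top_score = select_proposals(boxes, scores, maximum_proposals=1)
```

The 8 × 8 box is discarded despite having the highest score, because it is smaller than `minimum_size`.
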
Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, inputs, **kwargs) image_shape_and_scale has the shape [width, height, scale]
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
get_losses_for  
get_updates_for  

ProposalTarget

class keras_rcnn.layers.ProposalTarget(foreground=0.5, foreground_threshold=(0.5, 1.0), background_threshold=(0.1, 0.5), maximum_proposals=32, **kwargs)

# Arguments
fg_fraction: fraction of foreground objects
batchsize: number of objects in a batch
num_images: number of images to consider per batch (set to 1 for the time being)
num_classes: number of classes (object + background)
# Input shape
(None, None, 4), (None, None, classes), (None, None, 4)
# Output shape
[(None, None, 4), (None, None, classes), (None, None, 4)]
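
The default `foreground_threshold`, `background_threshold`, `foreground`, and `maximum_proposals` values in the signature suggest an overlap-based sampling scheme. The NumPy sketch below shows one way such a scheme could work; the function names are hypothetical, and the reading of the thresholds as intersection-over-union (IoU) ranges is an assumption rather than a statement about the layer's code.

```python
import numpy as np


def pairwise_iou(boxes, truths):
    """IoU between (N, 4) boxes and (M, 4) ground-truth boxes, shape (N, M)."""
    x1 = np.maximum(boxes[:, None, 0], truths[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], truths[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], truths[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], truths[None, :, 3])
    intersection = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    box_areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    truth_areas = (truths[:, 2] - truths[:, 0]) * (truths[:, 3] - truths[:, 1])
    return intersection / (box_areas[:, None] + truth_areas[None, :] - intersection)


def split_proposals(proposals, truths,
                    foreground_threshold=(0.5, 1.0),
                    background_threshold=(0.1, 0.5)):
    """Indices of foreground and background proposals by best IoU."""
    best = pairwise_iou(proposals, truths).max(axis=1)
    fg = np.flatnonzero((best >= foreground_threshold[0]) &
                        (best <= foreground_threshold[1]))
    bg = np.flatnonzero((best >= background_threshold[0]) &
                        (best < background_threshold[1]))
    return fg, bg


def sample_split(fg, bg, maximum_proposals=32, foreground=0.5, seed=0):
    """Randomly pick at most maximum_proposals indices, about `foreground` of them foreground."""
    rng = np.random.default_rng(seed)
    n_fg = min(int(round(maximum_proposals * foreground)), len(fg))
    n_bg = min(maximum_proposals - n_fg, len(bg))
    return (rng.choice(fg, n_fg, replace=False),
            rng.choice(bg, n_bg, replace=False))


truths = np.array([[0.0, 0.0, 10.0, 10.0]])
proposals = np.array([[0.0, 0.0, 10.0, 10.0],    # IoU 1.0
                      [0.0, 0.0, 10.0, 6.0],     # IoU 0.6
                      [0.0, 0.0, 10.0, 3.0],     # IoU 0.3
                      [20.0, 20.0, 30.0, 30.0]]) # IoU 0.0
fg, bg = split_proposals(proposals, truths)
fg_sample, bg_sample = sample_split(fg, bg, maximum_proposals=2)
```

Proposals whose best overlap falls below the background range, such as the last box above, are assigned to neither group.
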

Attributes:
batch_size
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, inputs[, training]) This is where the layer’s logic lives.
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_bbox_regression_labels(self, …) Bounding-box regression targets (bbox_target_data) are stored in a form N x (tx, ty, tw, th), labels N. This function expands those targets into the 4-of-4*K representation used by the network (i.e.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
sample(self, proposals, true_bounding_boxes, …) Generate a random sample of RoIs comprising foreground and background examples.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
find_foreground_and_background_proposal_indices  
get_bbox_targets  
get_losses_for  
get_updates_for  
sample_indices  
set_label_background  
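
The 4-of-4*K expansion mentioned for get_bbox_regression_labels can be illustrated with a short NumPy sketch. The function name is hypothetical, and treating label 0 as background with all-zero targets is an assumption made for the example.

```python
import numpy as np


def expand_regression_targets(targets, labels, num_classes):
    """Expand (N, 4) targets into (N, 4 * num_classes).

    Each row's (tx, ty, tw, th) is written into the four columns that
    belong to its class label; all other columns stay zero.
    Assumption: label 0 is background and receives no targets.
    """
    expanded = np.zeros((targets.shape[0], 4 * num_classes), dtype=targets.dtype)
    for row, label in enumerate(labels):
        if label > 0:
            expanded[row, 4 * label:4 * label + 4] = targets[row]
    return expanded


targets = np.array([[1.0, 2.0, 3.0, 4.0],
                    [5.0, 6.0, 7.0, 8.0]])
labels = np.array([2, 0])
expanded = expand_regression_targets(targets, labels, num_classes=3)
```
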

RCNN

class keras_rcnn.layers.RCNN(**kwargs)
Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, inputs, **kwargs) This is where the layer’s logic lives.
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
classification_loss  
get_losses_for  
get_updates_for  
regression_loss  

RegionOfInterest

class keras_rcnn.layers.RegionOfInterest(extent=(7, 7), strides=1, **kwargs)

ROI pooling layer proposed in Mask R-CNN (Kaiming He et al.).

Parameters:
  • extent – Fixed size [h, w], e.g. [7, 7], for the output slices.
  • strides – Integer, pooling stride.
Returns:

slices: 5D Tensor (number of regions, slice_height, slice_width, channels)
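
To show what a fixed-size output slice means, the sketch below crops one region from a feature map and resamples it to a fixed extent. It is a deliberately simplified stand-in: it uses nearest-neighbour sampling, whereas the sampling scheme the layer actually uses is not described here, and `crop_and_resize` is a hypothetical NumPy helper.

```python
import numpy as np


def crop_and_resize(feature_map, box, extent=(7, 7)):
    """Sample a fixed-size slice from one region of a feature map.

    feature_map: array of shape (height, width, channels).
    box: (x1, y1, x2, y2) in feature-map coordinates.
    Returns an array of shape (extent[0], extent[1], channels).
    """
    x1, y1, x2, y2 = box
    # Evenly spaced sample positions across the region.
    ys = np.linspace(y1, y2, extent[0])
    xs = np.linspace(x1, x2, extent[1])
    # Nearest-neighbour lookup, kept inside the feature map.
    rows = np.clip(np.round(ys).astype(int), 0, feature_map.shape[0] - 1)
    cols = np.clip(np.round(xs).astype(int), 0, feature_map.shape[1] - 1)
    return feature_map[np.ix_(rows, cols)]


feature_map = np.arange(16 * 16 * 3, dtype=float).reshape(16, 16, 3)
region = crop_and_resize(feature_map, box=(2, 3, 10, 12))
```

Whatever the size of the region, the result has the same spatial extent, which is what lets the following layers operate on a fixed input shape.
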

Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, x, **kwargs) rtype: (samples, proposals, width, height, channels)
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
get_losses_for  
get_updates_for  

RPN

class keras_rcnn.layers.RPN(**kwargs)
Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, inputs, **kwargs) This is where the layer’s logic lives.
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
classification_loss  
get_losses_for  
get_updates_for  
regression_loss  

Upsample

class keras_rcnn.layers.Upsample(**kwargs)
Attributes:
built
input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape tuple(s) of a layer.

losses
non_trainable_weights
output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape tuple(s) of a layer.

trainable_weights
updates
weights

Methods

__call__(self, inputs, **kwargs) Wrapper around self.call(), for handling internal references.
add_loss(self, losses[, inputs]) Adds losses to the layer.
add_update(self, updates[, inputs]) Adds updates to the layer.
add_weight(self, name, shape[, dtype, …]) Adds a weight variable to the layer.
assert_input_compatibility(self, inputs) Checks compatibility between the layer and provided inputs.
build(self, input_shape) Creates the layer weights.
call(self, inputs, **kwargs) This is where the layer’s logic lives.
compute_mask(self, inputs[, mask]) Computes an output mask tensor.
compute_output_shape(self, input_shape) Computes the output shape of the layer.
count_params(self) Counts the total number of scalars composing the weights.
from_config(config) Creates a layer from its config.
get_config(self) Returns the config of the layer.
get_input_at(self, node_index) Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(self, node_index) Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(self, node_index) Retrieves the input shape(s) of a layer at a given node.
get_output_at(self, node_index) Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(self, node_index) Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(self, node_index) Retrieves the output shape(s) of a layer at a given node.
get_weights(self) Returns the current weights of the layer.
set_weights(self, weights) Sets the weights of the layer, from Numpy arrays.
get_losses_for  
get_updates_for