CollaborativeCoding.models.solveig_model

Classes

SolveigModel

A Convolutional Neural Network (CNN) model for classification.

Functions

find_fc_input_shape(image_shape, model)

Finds the shape of the input to the fully connected layer after passing through the convolutional layers.

Module Contents

CollaborativeCoding.models.solveig_model.find_fc_input_shape(image_shape, model)

Finds the shape of the input to the fully connected layer after passing through the convolutional layers.

This function takes an input image shape and the model’s convolutional layers and computes the number of features passed into the first fully connected layer after the image has been processed through the convolutional layers.

Code inspired by @Seilmast (https://github.com/SFI-Visual-Intelligence/Collaborative-Coding-Exam/issues/67#issuecomment-2651212254).

Args

image_shape (tuple(int, int, int)): Shape of the input image (C, H, W), where C is the number of channels, H is the height, and W is the width.
model (nn.Module): The CNN model containing the convolutional layers. An image of the given shape is passed through these layers to determine the output size, which gives the number of input features for the fully connected layer.

Returns

int: The number of elements in the input to the fully connected layer after the image has passed through the convolutional layers. This value is used to initialize the size of the fully connected layer.
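The idea described above can be sketched as a dummy forward pass. This is a minimal illustration, not the module's actual code: the name `fc_input_size_sketch` is hypothetical, and unlike the documented function it takes a single `nn.Module` holding only the convolutional layers rather than the full model. The layer sizes in the usage lines are assumptions chosen for the example.

```python
import torch
from torch import nn


def fc_input_size_sketch(image_shape, conv_layers):
    """Return the number of features per sample produced by conv_layers.

    Hypothetical helper: builds a zero image of shape (C, H, W), adds a
    batch dimension of 1, runs it through the layers without tracking
    gradients, and counts the elements of the single output sample.
    """
    dummy = torch.zeros(1, *image_shape)
    with torch.no_grad():
        out = conv_layers(dummy)
    return out[0].numel()


# Assumed example layers (not taken from SolveigModel).
conv_layers = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3),  # 28x28 -> 26x26, 4 channels
    nn.ReLU(),
    nn.MaxPool2d(2),                 # 26x26 -> 13x13
)
n_features = fc_input_size_sketch((1, 28, 28), conv_layers)
fc = nn.Linear(n_features, 10)       # size the classifier from the result
```

Measuring the size this way avoids hand-computing the output dimensions whenever a kernel size, stride, or pooling layer changes.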

class CollaborativeCoding.models.solveig_model.SolveigModel(image_shape, num_classes)

Bases: torch.nn.Module

A Convolutional Neural Network (CNN) model for classification.

This model is designed for image classification tasks. It contains three convolutional blocks followed by a fully connected layer to make class predictions.

Args

image_shape (tuple(int, int, int)): Shape of the input image (C, H, W), where C is the number of channels, H is the height, and W is the width.
num_classes (int): The number of output classes, which sets the number of units in the final fully connected layer.

Attributes

conv_block1 (nn.Sequential): The first convolutional block consisting of a convolutional layer, ReLU activation, and max-pooling.
conv_block2 (nn.Sequential): The second convolutional block consisting of a convolutional layer and ReLU activation.
conv_block3 (nn.Sequential): The third convolutional block consisting of a convolutional layer and ReLU activation.
fc1 (nn.Linear): The fully connected layer that takes the output from the convolutional blocks and outputs the final classification logits (raw scores for each class).

Methods

forward(x): Defines the forward pass of the network, which passes the input through the convolutional layers followed by the fully connected layer to produce class logits.
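The layout described by the attributes and the forward method can be sketched as follows. This is an illustrative stand-in, not the real SolveigModel: the class name `SketchCNN`, the channel counts, kernel sizes, and padding are all assumptions, since the reference above does not state them. Only the block structure (conv + ReLU + max-pool, then two conv + ReLU blocks, then one linear layer) follows the documentation.

```python
import torch
from torch import nn


class SketchCNN(nn.Module):
    """Hypothetical model mirroring the documented block structure."""

    def __init__(self, image_shape, num_classes):
        super().__init__()
        in_channels = image_shape[0]
        # Channel counts, kernel sizes and padding are assumed values.
        self.conv_block1 = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.conv_block2 = nn.Sequential(
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.conv_block3 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Infer the flattened feature count with a dummy forward pass.
        with torch.no_grad():
            dummy = torch.zeros(1, *image_shape)
            n_features = self._features(dummy).flatten(1).shape[1]
        self.fc1 = nn.Linear(n_features, num_classes)

    def _features(self, x):
        # Apply the three convolutional blocks in order.
        return self.conv_block3(self.conv_block2(self.conv_block1(x)))

    def forward(self, x):
        x = self._features(x)
        x = torch.flatten(x, 1)  # keep the batch dimension, flatten the rest
        return self.fc1(x)       # raw class logits


model = SketchCNN(image_shape=(1, 28, 28), num_classes=10)
logits = model(torch.zeros(4, 1, 28, 28))  # batch of 4 grayscale images
```

With these assumed settings, only the first block changes the spatial size (28x28 to 14x14), so `fc1` receives 32 * 14 * 14 features per image.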

conv_block1

conv_block2

conv_block3

fc1

forward(x)

Defines the forward pass of the network.

Args

x (torch.Tensor): A 4D tensor with shape (Batch Size, Channels, Height, Width) representing the input images.

Returns

torch.Tensor: A 2D tensor of shape (Batch Size, num_classes) containing the logits (raw class scores) for each input image in the batch. These logits can be passed through a softmax function to obtain class probabilities.
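As a small illustration of turning logits into probabilities, the sketch below applies softmax over the class dimension. The logit values are made up for the example and do not come from the model.

```python
import torch

# Made-up logits for a batch of 2 images and 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.2, 0.3]])

# Softmax over dim=1 (the class dimension) makes each row sum to 1.
probs = torch.softmax(logits, dim=1)

# The predicted class is the index of the largest probability per row.
predicted = probs.argmax(dim=1)
```

Softmax preserves the ordering of the scores, so `logits.argmax(dim=1)` gives the same predictions. When training with `torch.nn.CrossEntropyLoss`, pass the raw logits rather than the softmax output, because that loss applies log-softmax internally.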