CollaborativeCoding.models.solveig_model

Classes

SolveigModel

A Convolutional Neural Network (CNN) model for classification.

Functions

find_fc_input_shape(image_shape, model)

Finds the shape of the input to the fully connected layer after passing through the convolutional layers.

Module Contents

CollaborativeCoding.models.solveig_model.find_fc_input_shape(image_shape, model)

Finds the shape of the input to the fully connected layer after passing through the convolutional layers.

Given an input image shape and a model, this function computes the number of features that reach the first fully connected layer once the image has passed through the model's convolutional layers.

Code inspired by @Seilmast (https://github.com/SFI-Visual-Intelligence/Collaborative-Coding-Exam/issues/67#issuecomment-2651212254).

Args

image_shape : tuple(int, int, int)

Shape of the input image (C, H, W), where C is the number of channels, H is the height, and W is the width of the image.

model : nn.Module

The CNN model containing the convolutional layers. An image of the given shape is passed through these layers, and the size of the result gives the number of input features for the fully connected layer.

Returns

int

The number of elements fed into the fully connected layer after the image has passed through the convolutional layers. This value sets the input size of the fully connected layer.
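The description above can be illustrated with a minimal sketch. It assumes the function pushes a dummy tensor through the attributes conv_block1, conv_block2, and conv_block3 and counts the resulting elements; the actual implementation may differ. TinyConvStack and its layer sizes are hypothetical stand-ins used only to make the example runnable.

```python
import torch
from torch import nn


def find_fc_input_shape(image_shape, model):
    """Count the features that reach the fully connected layer (sketch)."""
    # Batch of one blank image with shape (1, C, H, W).
    dummy = torch.zeros(1, *image_shape)
    # Gradients are not needed for a shape probe.
    with torch.no_grad():
        x = model.conv_block1(dummy)
        x = model.conv_block2(x)
        x = model.conv_block3(x)
    # The batch size is 1, so numel() equals C_out * H_out * W_out.
    return x.numel()


class TinyConvStack(nn.Module):
    """Hypothetical stand-in exposing the three documented conv blocks."""

    def __init__(self, in_channels):
        super().__init__()
        # Channel counts and kernel sizes are assumptions, not the real values.
        self.conv_block1 = nn.Sequential(
            nn.Conv2d(in_channels, 4, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.conv_block2 = nn.Sequential(
            nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU()
        )
        self.conv_block3 = nn.Sequential(
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU()
        )


num_features = find_fc_input_shape((1, 28, 28), TinyConvStack(in_channels=1))
print(num_features)
```

With these assumed layers, a 28 x 28 image is halved once by the max-pooling step, so the count is 16 channels times 14 x 14 spatial positions.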

class CollaborativeCoding.models.solveig_model.SolveigModel(image_shape, num_classes)

Bases: torch.nn.Module

A Convolutional Neural Network (CNN) model for classification.

This model is designed for image classification tasks. It contains three convolutional blocks followed by a fully connected layer to make class predictions.

Args

image_shape : tuple(int, int, int)

Shape of the input image (C, H, W), where C is the number of channels, H is the height, and W is the width of the image.

num_classes : int

The number of output classes, which is also the number of units in the final fully connected layer.

Attributes

conv_block1 : nn.Sequential

The first convolutional block consisting of a convolutional layer, ReLU activation, and max-pooling.

conv_block2 : nn.Sequential

The second convolutional block consisting of a convolutional layer and ReLU activation.

conv_block3 : nn.Sequential

The third convolutional block consisting of a convolutional layer and ReLU activation.

fc1 : nn.Linear

The fully connected layer that takes the output from the convolutional blocks and outputs the final classification logits (raw scores for each class).

Methods

forward(x)

Defines the forward pass of the network, which passes the input through the convolutional layers followed by the fully connected layer to produce class logits.
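A rough sketch of a model with the documented structure may help make the attributes concrete. The class name SolveigModelSketch, the helper _features, the channel counts, kernel sizes, and padding are all assumptions chosen for illustration; only the block layout (conv + ReLU + max-pool, then two conv + ReLU blocks, then one linear layer) comes from the documentation above.

```python
import torch
from torch import nn


class SolveigModelSketch(nn.Module):
    """Illustrative CNN with three conv blocks and one linear layer."""

    def __init__(self, image_shape, num_classes):
        super().__init__()
        in_channels = image_shape[0]
        # Block 1: convolution, ReLU, max-pooling (hyperparameters assumed).
        self.conv_block1 = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        # Blocks 2 and 3: convolution followed by ReLU.
        self.conv_block2 = nn.Sequential(
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU()
        )
        self.conv_block3 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU()
        )
        # Probe the conv stack with a blank image to size the linear layer.
        with torch.no_grad():
            fc_in = self._features(torch.zeros(1, *image_shape)).numel()
        self.fc1 = nn.Linear(fc_in, num_classes)

    def _features(self, x):
        # Hypothetical helper: run the three conv blocks in order.
        return self.conv_block3(self.conv_block2(self.conv_block1(x)))

    def forward(self, x):
        x = self._features(x)
        # Flatten everything except the batch dimension.
        x = torch.flatten(x, start_dim=1)
        return self.fc1(x)


model = SolveigModelSketch(image_shape=(3, 16, 16), num_classes=10)
logits = model(torch.randn(4, 3, 16, 16))
print(logits.shape)
```

Sizing fc1 from a probe pass means the model adapts to any image_shape without hand-computed dimensions, which appears to be the purpose of find_fc_input_shape.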

conv_block1
conv_block2
conv_block3
fc1
forward(x)

Defines the forward pass of the network.

Args

x : torch.Tensor

A 4D tensor with shape (Batch Size, Channels, Height, Width) representing the input images.

Returns

torch.Tensor

A 2D tensor of shape (Batch Size, num_classes) containing the logits (raw class scores) for each input image in the batch. Applying a softmax over the class dimension converts these logits to probabilities.
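As a small illustration of the softmax step, the following sketch converts a batch of logits to probabilities and predicted class indices. The logit values are made up for the example and do not come from the model.

```python
import torch

# Made-up logits for a batch of 2 images and 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.3, 3.0]])

# Softmax over dim=1 (the class dimension) yields one distribution per image.
probs = torch.softmax(logits, dim=1)

# The predicted class is the index with the highest probability.
predicted = probs.argmax(dim=1)
print(probs.sum(dim=1), predicted)
```

During training, torch.nn.CrossEntropyLoss expects the raw logits and applies the log-softmax internally, so the explicit softmax is typically only needed at inference time.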