As the project has grown, adding new features directly on top of the original code keeps introducing bugs. Today I ran into two similar ones in a single day, which is genuinely annoying!
In the original model, every nn.Conv2d's kernel_size mapped one-to-one to its stride. In the extended model, however, one layer broke that correspondence, so the code that assigned stride based on kernel_size simply stopped working.
If I hadn't happened to add an assert today to check the output shape, I might never have found this bug.
# Old rule: padding and stride are derived directly from kernel_size,
# assuming the two always correspond one-to-one.
if kernel_size == 3:
    padding = 1
    stride = 1
elif kernel_size == 5:
    padding = 2
    stride = 2
else:
    raise ValueError(f"unexpected kernel_size: {kernel_size}")
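For reference, here is a minimal sketch of the kind of shape assert that surfaced the mismatch. The channel counts, the input size, and the assumption that this particular layer should preserve resolution are illustrative, not taken from the real model; running it fires the assert, which is exactly how the bug showed up.

import torch
import torch.nn as nn

# Hypothetical layer that is supposed to keep the spatial resolution,
# but the kernel_size-based rule above forces stride=2 onto it.
conv = nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2)

x = torch.randn(1, 16, 64, 64)
out = conv(x)

# The shape check: stride=2 halves 64x64 to 32x32, so this assert fires
# and turns a silent mismatch into a loud one.
assert out.shape[-2:] == x.shape[-2:], (
    f"expected spatial size {tuple(x.shape[-2:])}, got {tuple(out.shape[-2:])}"
)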
Previously, every convolutional layer in the model was followed by an nn.ReLU. To meet a new resolution requirement, I added a transformation layer that has no nn.ReLU after it, but its name follows the same convention as the earlier convolutions. As a result, the existing simulation logic, which clamps negative values for any layer whose name starts with 'classifier_conv', immediately went wrong.
o( ̄┰ ̄*)ゞ
-if quant.startswith('classifier_conv'):
+if quant.startswith('classifier_conv') and quant != 'classifier_conv_transfer':
     x[x < 0] = 0
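Spelled out as a tiny helper, the corrected simulation logic looks roughly like this. The function name simulate_relu is mine; quant (the layer name string) and the 'classifier_conv' / 'classifier_conv_transfer' names come from the snippet above.

import torch

def simulate_relu(quant: str, x: torch.Tensor) -> torch.Tensor:
    # Every classifier conv is followed by nn.ReLU in the real model,
    # except the new transfer layer, so only clamp negatives for the rest.
    if quant.startswith('classifier_conv') and quant != 'classifier_conv_transfer':
        x = x.clone()   # avoid mutating the caller's tensor in place
        x[x < 0] = 0
    return x

# e.g. simulate_relu('classifier_conv_transfer', out) now leaves out untouched.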
In short, both of today's bugs came from small additions made while migrating from the original code, which left condition checks that no longer covered every case. Such issues seem hard to avoid unless a highly extensible architecture is designed from the start, but who can anticipate future requirements at the outset? Research isn't like building a product: new ideas come up all the time, which makes planning that far ahead even less realistic.