Does convolution share weight in the same batch

edream_zhao · Registered member
2023-02-27 13:26

This answer quotes ChatGPT.

Yes. In a convolutional neural network (CNN), the weights of a given convolutional kernel are shared: the same kernel slides over every spatial position of the input, and the same kernel is applied to every sample in a batch. Because the kernel uses identical weights at each position, it extracts the same kind of feature wherever it is applied. So when the same kernel is convolved with the different feature maps in one batch, its weights are shared across all of them, which reduces the parameter count and improves the efficiency and generalization ability of the model.

For the stack of feature maps you mentioned, suppose the convolution kernel has shape (Cout, Cin, kh, kw), where Cout and Cin are the numbers of output and input channels, and kh and kw are the kernel height and width. Each feature map in the stack is then convolved with that same kernel, so they all share the same weights.
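The shapes above can be illustrated with a minimal numpy sketch (all sizes here are made up for illustration): one kernel tensor of shape (Cout, Cin, kh, kw) is applied to every sample in a batch of shape (N, Cin, H, W), so there is no per-sample copy of the weights.

```python
import numpy as np

# Hypothetical sizes, just for demonstration.
Cout, Cin, kh, kw = 2, 3, 3, 3
N, H, W = 4, 5, 5

rng = np.random.default_rng(0)
kernel = rng.standard_normal((Cout, Cin, kh, kw))  # ONE weight tensor for the whole batch
x = rng.standard_normal((N, Cin, H, W))            # batch of 4 feature maps

def conv2d(img, k):
    """Valid cross-correlation of one (Cin, H, W) input with a (Cout, Cin, kh, kw) kernel."""
    Ho, Wo = H - kh + 1, W - kw + 1
    out = np.zeros((Cout, Ho, Wo))
    for co in range(Cout):
        for i in range(Ho):
            for j in range(Wo):
                # Same weights k[co] reused at every spatial position (i, j).
                out[co, i, j] = np.sum(img[:, i:i+kh, j:j+kw] * k[co])
    return out

# Every sample in the batch goes through the SAME kernel:
outs = np.stack([conv2d(x[n], kernel) for n in range(N)])
print(outs.shape)  # (4, 2, 3, 3)
```

Because `kernel` is the only weight tensor, convolving two identical inputs necessarily gives identical outputs; that is what "weight sharing within a batch" means in practice.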

deoqmgef · Registered member
2023-02-27 13:26

This reference answer quotes ChatGPT (OpenAI). In a convolutional neural network, each convolutional layer usually has more than one filter (also called a convolution kernel, or weights), and each filter is responsible for extracting a different feature from the input feature map. During the convolution operation, each filter slides over the input feature map, computes a weighted sum at each position, and produces an output feature map.

For a given convolutional layer, every input is convolved with the same set of filter weights. So when the stack of feature maps has shape (4, 3, h, w), the four feature maps share weights with one another: each filter computes exactly the same function on every input feature map.

Weight sharing is one of the advantages of convolutional neural networks: it greatly reduces the number of parameters to be learned and, to some extent, improves the generalization ability of the network. Because each filter performs the same convolution on all input feature maps, the network learns features across a wider range of data and handles new inputs better.
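The parameter-count argument can be made concrete with a back-of-envelope sketch (the layer sizes below are hypothetical): a conv layer needs only Cout·Cin·kh·kw weights plus Cout biases, no matter how large the image is, while a fully connected layer on the same input needs one weight per input-output pair.

```python
# Hypothetical sizes for illustration.
Cin, H, W = 3, 32, 32      # input: 3-channel 32x32 feature map
Cout, kh, kw = 16, 3, 3    # conv layer: 16 filters of size 3x3

# Conv layer: the same kh x kw weights are reused at every spatial position.
conv_params = Cout * Cin * kh * kw + Cout
print(conv_params)  # 448

# Fully connected layer producing an output of the same size (Cout x H x W):
# a separate weight for every (input unit, output unit) pair, plus biases.
fc_params = (Cin * H * W) * (Cout * H * W) + Cout * H * W
print(fc_params)  # tens of millions
```

The conv layer's 448 parameters versus the dense layer's tens of millions is exactly the saving that weight sharing buys.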