A machine learning model is what you use to get predictions. The model is the entity you train on past data to give answers about unseen future samples.
The model must be able to report a confidence at prediction time. Without some form of confidence, most predictions are useless.
Bad model prediction: Cat or Dog
Good model prediction: Cat (82%), Dog (5%)
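One common way to get per-class confidences is to pass the model's raw scores through a softmax. The notes do not say which method is meant, so this is only a minimal sketch under that assumption; the class labels and score values are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["Cat", "Dog", "Other"]  # hypothetical classes
scores = [2.9, 0.1, 1.0]          # made-up raw model outputs
probs = softmax(scores)

# Report every class with its confidence, most likely first.
for label, p in sorted(zip(labels, probs), key=lambda lp: -lp[1]):
    print(f"{label} ({p:.0%})")
```

The point is the shape of the output: a label with a number attached, rather than a bare label.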
Dropout during training suits dense features. It works better for dense features (e.g. vision) than for sparse features (e.g. topic-based). So if most feature values are zero, don't use dropout; if few feature values are zero, apply some dropout.
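A minimal sketch of what dropout does to a feature vector, assuming the common "inverted dropout" variant (the notes do not name a variant). The helper names, the example vectors, and the 0.5 rate are illustrative only.

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate` and
    scale the survivors by 1 / (1 - rate) to keep the expected value."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def zero_fraction(values):
    """Share of entries that are exactly zero (a rough sparsity measure)."""
    return sum(1 for v in values if v == 0) / len(values)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
dense = [0.8, 0.3, 0.5, 0.9, 0.1, 0.7]   # few zeros, e.g. pixel values
sparse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # mostly zeros, e.g. topic flags

print(zero_fraction(dense), dropout(dense, rate=0.5, rng=rng))
print(zero_fraction(sparse), dropout(sparse, rate=0.5, rng=rng))
```

With the sparse vector there is a single informative entry, so dropping it leaves nothing to learn from on that step, which is one way to read the advice above.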
A convolution maps each region of an image to an entry in a feature map. It is also known as a local neural network feature detector.
For example, a 5x5 array of pixels could be mapped to oriented edge features.
The convolution is repeated over and over at various pixel locations in what is sometimes called a "sliding window" or "sliding rectangle".
How far the window moves at each step is called the step size or stride.
Windows can overlap, and overlapping is often preferred.
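The sliding window and stride can be sketched in plain Python. This is an illustration, not a reference implementation: it uses no padding, it does not flip the kernel (as is usual in neural networks), and the 3x3 vertical-edge kernel and the 5x5 image are made up for the example.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out_h = (ih - kh) // stride + 1  # number of window positions vertically
    out_w = (iw - kw) // stride + 1  # number of window positions horizontally
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            top, left = r * stride, c * stride  # window's top-left corner
            row.append(sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            ))
        feature_map.append(row)
    return feature_map

# 5x5 image: dark columns (0) on the left, bright columns (1) on the right.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
# Hypothetical vertical-edge detector: responds to dark-to-bright going right.
kernel = [[-1, 0, 1] for _ in range(3)]

print(convolve2d(image, kernel, stride=1))  # [[0, 3, 3], [0, 3, 3], [0, 3, 3]]
print(convolve2d(image, kernel, stride=2))  # [[0, 3], [0, 3]]
```

With stride 1 the 3x3 windows overlap by two columns and the map is 3x3; with stride 2 they overlap by one column and the map shrinks to 2x2.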
Convolution creates low-level building blocks for higher-level features. The next level up in the network tries to detect higher-level features from these low-level building blocks.
For example, detecting a corner from two intersecting edges.
A pooling layer is usually applied after convolution. Pooling is used to detect variations on the same structural theme.
For example, the outputs of adjacent vertical edge detectors will be "max pooled" to a unit that says "there is a vertical edge in the vicinity of this location".
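A minimal sketch of max pooling over an edge-detector feature map. The 2x2 window, the stride of 2, and the feature-map values are assumptions chosen for the example.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each `size` x `size` window."""
    h, w = len(feature_map), len(feature_map[0])
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Made-up vertical-edge responses: a strong edge near the top-left corner.
edges = [
    [0, 3, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
]
# The same edge, moved by one pixel inside the same 2x2 neighbourhood.
shifted = [
    [0, 0, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
]

print(max_pool2d(edges))    # [[3, 0], [0, 2]]
print(max_pool2d(shifted))  # [[3, 0], [0, 2]]
```

Both inputs pool to the same result: the pooled unit reports that there is an edge somewhere in its neighbourhood, not exactly where.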