Advice on image resolution for training datasets


It really depends on the size of your network and your GPU. You need to fit a reasonably sized batch (16-64 images) in GPU memory. That can easily get very big: you can compute the size of the intermediate activations in bytes as 4*batch_size*num_feature_maps*height*width, where 4 is the number of bytes in a 32-bit float. Say you take 32 square images of 112×112 with 64 feature maps. That would be about 100 MB just for the activations, and the same amount again for the gradients. Take a relatively big network (for example, VGG16) and you already need a few GB.
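As a quick check of the formula above, here is a minimal sketch in Python. The function name activation_bytes and the bytes_per_value parameter are made up for illustration; the 4 bytes assume float32 values, and the estimate covers the output of a single layer only.

```python
def activation_bytes(batch_size, num_feature_maps, height, width, bytes_per_value=4):
    """Memory for one layer's output activations, assuming float32 (4-byte) values."""
    return bytes_per_value * batch_size * num_feature_maps * height * width


# The example from the text: 32 images of 112x112 with 64 feature maps.
n_bytes = activation_bytes(32, 64, 112, 112)
print(f"{n_bytes / 1e6:.1f} MB for activations")             # 102.8 MB
print(f"{2 * n_bytes / 1e6:.1f} MB including gradients")     # 205.5 MB
```

Summing this over every layer of a deep network is what pushes the total into the GB range.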

Another aspect is the size of the receptive field. If you follow the current advice to prefer small filters (3×3) and take big images, you can end up with either a quite shallow network (because you can’t fit a lot of layers into GPU memory) or a narrow network (which is OK if you know how to train it). The shallow network will necessarily have small effective receptive fields, and will therefore approximate a more local and simpler function.
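To see how slowly the receptive field grows with 3×3 filters, here is a rough sketch that computes the theoretical receptive field of a stack of layers. The function name and the two layer stacks are illustrative assumptions, not taken from any particular network, and the effective receptive field in a trained network is generally smaller than this theoretical value.

```python
def receptive_field(layers):
    """Theoretical receptive field of a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride) pairs, ordered input to output.
    """
    rf = 1     # receptive field of a single input pixel
    jump = 1   # distance, in input pixels, between neighbouring outputs
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# A shallow stack: four 3x3 convolutions with stride 1.
shallow = [(3, 1)] * 4
print(receptive_field(shallow))   # 9

# A deeper stack: three blocks of two 3x3 convolutions plus a 2x2 stride-2 pooling.
deeper = [(3, 1), (3, 1), (2, 2)] * 3
print(receptive_field(deeper))    # 36
```

On a 512×512 input, a unit that sees only 9×9 pixels has very little context, which is the sense in which a shallow network approximates a more local function.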

So the rule of thumb is to use images of about 256×256 for ImageNet-scale networks and about 96×96 for something smaller and easier. I have heard that on Kaggle people sometimes train on 512×512, but you will need to compromise on something. Or just buy a GPU cluster.

If you train fully convolutional networks like Faster R-CNN, you can take much bigger images (say 800×600) because the batch size is 1.

(2) Changing the batch and image size during training: Part of the reason why many research papers can report such large batch sizes is that many standard research datasets have images that aren’t very big. When training networks on ImageNet, for example, most state-of-the-art networks used crops between 200 and 350 pixels on a side; of course they can have large batches with such small image sizes! In practice, due to current camera technology, most of the time we are working with images that are 1080p or at least not too far off from it.

To get around this small bump in the road, you can start your training with smaller images and a larger batch size. Do this by downsampling your training images. You’ll then be able to fit many more of them into one batch. With the large batch size and small images you should already be able to get some decent results. To complete the training of your network, fine-tune it with a smaller learning rate, larger images, and a smaller batch size. This gets the network to re-adapt to the higher resolution, and the lower learning rate keeps the network from jumping away from the good minimum found during large-batch training. As a result, your network reaches a good minimum from the large-batch training and works well on your high-resolution images after the fine-tuning.
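The two-phase recipe above could be planned with something like the following sketch. Everything here is an assumption made for illustration: the function name, the rule of shrinking the batch so that the number of pixels per batch stays roughly constant, and the 0.1 learning-rate factor for fine-tuning. The text itself only says "smaller" and "larger".

```python
def progressive_schedule(base_size, base_batch, base_lr, final_size, final_lr_scale=0.1):
    """Two-phase plan: small images + large batch, then large images + small batch.

    Assumption: the fine-tuning batch is shrunk so that the pixel count per
    batch stays roughly constant, and the learning rate is scaled by
    `final_lr_scale` (0.1 is an arbitrary illustrative default).
    """
    pixel_ratio = (final_size / base_size) ** 2
    fine_batch = max(1, int(base_batch / pixel_ratio))
    return [
        {"phase": "pretrain", "image_size": base_size,
         "batch_size": base_batch, "lr": base_lr},
        {"phase": "finetune", "image_size": final_size,
         "batch_size": fine_batch, "lr": base_lr * final_lr_scale},
    ]


# Example: pretrain at 256x256 with batch 64, then fine-tune at 1024x1024.
for phase in progressive_schedule(base_size=256, base_batch=64, base_lr=0.1, final_size=1024):
    print(phase)
```

Going from 256 to 1024 pixels per side multiplies the pixel count by 16, so under this rule the fine-tuning batch drops from 64 to 4.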

Large Data Sets Cause the Model Size to Explode

Most research papers and consumer use cases tend to use low-resolution images for training deep learning models, often as small as 256×256 pixels. In fact, many resize the ImageNet data set images down to this resolution.

Several enterprise use cases, however, require the use of high resolution images. For example, when working with medical images, resizing them to lower resolution can mean that what was a cancer lesion now becomes a dot on a smaller image.

The challenge in keeping the images large is that the deep learning model size explodes. If you are operating on an image that is 2,000 by 2,000 pixels, for example, the input alone has 4 million pixels, so each layer can have millions of nodes.

Data Preparation. A fixed size must be selected for input images, and all images must be resized to that shape. The most common type of pixel scaling involves centering pixel values per-channel, perhaps followed by some type of normalization.
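Per-channel centering followed by a simple normalization could look like the sketch below. It uses plain Python lists so that it runs without extra libraries; the function names, the tiny 2×2 image, and the choice to divide by 255 are assumptions for illustration. In practice the channel means would normally be computed over the whole training set rather than a single image.

```python
def channel_means(image):
    """Mean of each channel for an image stored as rows of (r, g, b) tuples."""
    pixels = [px for row in image for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))


def center_and_scale(image, means, scale=255.0):
    """Subtract the per-channel means, then divide by `scale` to normalize the range."""
    return [
        [tuple((px[c] - means[c]) / scale for c in range(3)) for px in row]
        for row in image
    ]


# A tiny 2x2 RGB image with 8-bit pixel values.
image = [[(0, 128, 255), (255, 128, 0)],
         [(0, 128, 255), (255, 128, 0)]]

means = channel_means(image)
prepared = center_and_scale(image, means)
print(means)            # (127.5, 128.0, 127.5)
print(prepared[0][0])   # (-0.5, 0.0, 0.5)
```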

A single neural network can only work with images of one standard size; images that are too small must be scaled up and images that are too large must be scaled down. But what image size should we pick?

If your goal is model accuracy, larger is generally better. But there is a lot of advantage to starting small.

A simple way to deal with differently sized images is to downscale them all to match the dimensions of the smallest image available.
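Picking the target size from the smallest image and downscaling to it might be sketched as follows. The helper names are hypothetical, and the nearest-neighbor resize is only there to keep the example dependency-free; a real pipeline would normally use an image library's resize with proper interpolation. Note that taking the smallest width and the smallest height separately can change the aspect ratio of some images.

```python
def smallest_common_size(sizes):
    """Smallest width and smallest height across a list of (width, height) pairs."""
    widths, heights = zip(*sizes)
    return min(widths), min(heights)


def resize_nearest(image, new_width, new_height):
    """Nearest-neighbor resize of an image stored as a list of rows."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]


sizes = [(640, 480), (800, 600), (512, 512)]
print(smallest_common_size(sizes))   # (512, 480)

# Downscale a 4x4 single-channel image to 2x2.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
print(resize_nearest(image, 2, 2))   # [[0, 2], [8, 10]]
```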

– how to prepare dataset with labelimg    

– raccoon dataset
