Usage Guide
Use Cases
Before getting into the library’s usage, we need to talk about when it makes sense to use it.
There are many ways in which regions can be salient in video. Dosage, however, looks at color information only and processes every frame independently. This means that things like movement or changes in depth of field or camera focus generally won’t contribute to saliency.
Another point to keep in mind is that a clear distinction between foreground and background is key for the algorithm to perform at its best. When scenes become too chaotic, its effectiveness drops, and that’s where machine learning approaches could provide a significant edge.
Basic Usage
The library exposes a single member, the Dosage class.
Every class instance is fully configured on creation, after which the run method can be called to process any number of frames. The only restriction on the frames is that they must match the width and height supplied to the instance when it was created.
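import numpy as np
from pydosage import Dosage

width = 800
height = 600
dosage = Dosage(width, height)  # width and height are the only required arguments
frame = np.zeros((height, width, 3), dtype=np.uint8)  # placeholder; use a real video frame here
saliency = dosage.run(frame)  # frame is a (600, 800, 3) NDArray[np.uint8]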
Important Note: When deflicker is set to True (the default), you should create a new Dosage instance for each video you want to process. If you’re batch processing independent images, you should set deflicker to False.
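The note above can be turned into a small helper that builds a fresh instance for every video. This is a minimal sketch, not part of the library: saliency_per_video and make_detector are hypothetical names, and the only thing assumed about the detector is the run method described above.

```python
def saliency_per_video(videos, make_detector):
    """Yield (video_index, saliency) pairs, using a new detector per video.

    `videos` is an iterable of frame sequences. `make_detector` is a
    hypothetical zero-argument factory, for example
    `lambda: Dosage(width, height)`.
    """
    for index, frames in enumerate(videos):
        # A new instance per video, as the note above recommends when
        # deflicker is left at its default of True.
        detector = make_detector()
        for frame in frames:
            yield index, detector.run(frame)
```

Passing a factory rather than an instance keeps the "one instance per video" rule in one place, and makes it easy to swap in a different configuration.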
Methods
The Dosage class supports three different saliency detection methods via the method parameter.
MethodDosage: Judges how salient a pixel is by estimating the probability density function of one or more background regions.
MethodFastMBD: Judges how salient a pixel is by determining how well connected it is to a background region. It’s a slightly modified version of the “Fast Minimum Barrier Distance” algorithm.
MethodHybrid: A linear combination of the results produced by the two previous methods. The better they perform, the better MethodHybrid performs.
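To make the idea of a linear combination concrete, here is a minimal sketch in plain Python. The blend function and its weight argument are illustrative assumptions. This guide does not state which weights the library uses, and this is not the library’s code.

```python
def blend(saliency_a, saliency_b, weight=0.5):
    """Linearly combine two saliency maps given as equally sized 2D lists.

    `weight` is a hypothetical mixing factor in [0, 1]: 1.0 keeps only
    saliency_a, 0.0 keeps only saliency_b.
    """
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(saliency_a, saliency_b)
    ]


# Two tiny 1x2 maps standing in for the outputs of the two methods.
combined = blend([[0.0, 1.0]], [[1.0, 1.0]], weight=0.5)
```

A pixel that both inputs rate highly stays high, while a pixel only one of them rates highly ends up in between, which is why the combined result depends on how well each input performs.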
Which method should I use?
Generally:
MethodHybrid outperforms both MethodDosage and MethodFastMBD.
MethodFastMBD outperforms MethodDosage.
With that said, any method can outperform the others depending on the specific scenario. Above all, experiment and see what works best for you.
One benefit of MethodDosage is that it’s much faster. So if it’s giving you good results and you want the speed boost, it may be a good choice.
Foreground Boundaries
As mentioned above, both MethodDosage and MethodFastMBD make use of a “background region”. This region is chosen by leveraging a known spatial prior: an image’s background is likely to show up near its boundaries. This heuristic is of course not perfect, but it works very well in a large number of scenarios.
There are cases where the user can help by declaring that one of the image’s boundaries doesn’t represent the background well. One typical scenario is that of a subject that occupies a large portion of a frame and is cut off at the bottom.
The example below showcases the difference when using MethodFastMBD and passing different values to the foreground_boundary parameter. By declaring that the bottom boundary is a foreground boundary and does not represent the background well, the algorithm does a much better job.
[Figure panels: Input | BoundaryNone | BoundaryBottom]
It’s worth noting that MethodDosage is not as susceptible to this issue, so when using it or when using MethodHybrid, you may get good results without worrying too much about it.
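The spatial prior can be pictured as taking a thin strip along each image edge and skipping any edge the user marks as foreground. The sketch below is only an illustration under that assumption. The function name, the margin argument and the string edge names are hypothetical. They are not the library’s BoundaryNone or BoundaryBottom values, nor its actual region selection.

```python
def background_coords(height, width, margin=1, foreground=()):
    """Return sorted (row, col) coordinates of edge strips used as background.

    `foreground` lists edges ("top", "bottom", "left", "right") that should
    NOT be treated as background. All names here are hypothetical.
    """
    coords = set()
    for r in range(height):
        for c in range(width):
            on_edge = {
                "top": r < margin,
                "bottom": r >= height - margin,
                "left": c < margin,
                "right": c >= width - margin,
            }
            # Keep the pixel if it lies on at least one edge that is
            # still considered background.
            if any(hit and name not in foreground for name, hit in on_edge.items()):
                coords.add((r, c))
    return sorted(coords)


all_edges = background_coords(3, 3)
without_bottom = background_coords(3, 3, foreground=("bottom",))
```

In this toy version the bottom corners are still sampled through the left and right strips, so only pixels that belong exclusively to the bottom edge drop out of the background region.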
Morphological Reconstruction
Morphological reconstruction is a process that can be useful both for removing irrelevant features from images and for completing shapes that are otherwise missing pieces.
Dosage supports morphological reconstruction via the reconstruct parameter, albeit with an important caveat. Due to performance concerns, it runs for a fixed number of iterations (set via the reconstruct_iter parameter) instead of running until convergence. It is then up to the user to fine-tune it to achieve the desired result. To learn more about it and its related parameters, refer to the API reference.
By default, a Dosage instance performs a very small degree of reconstruction, which helps remove tiny features and smooth out the result quite a bit. An example can be seen below.
[Figure panels: Input | reconstruct = False | reconstruct = True]
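To see why a fixed iteration count matters, here is a minimal sketch of textbook reconstruction by dilation on tiny 2D lists. It is an assumption that this resembles what the library does internally. The function names are hypothetical, and how the library derives its marker image is not covered here.

```python
def dilate(img):
    """Grayscale dilation with a 3x3 neighborhood on a 2D list."""
    h, w = len(img), len(img[0])
    return [
        [
            max(
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            )
            for c in range(w)
        ]
        for r in range(h)
    ]


def reconstruct_fixed(marker, mask, iterations):
    """Grow `marker` under `mask` for a fixed number of iterations."""
    current = [[min(m, k) for m, k in zip(mr, kr)] for mr, kr in zip(marker, mask)]
    for _ in range(iterations):
        grown = dilate(current)
        # Clamp to the mask so the marker never grows past the original shape.
        current = [[min(g, k) for g, k in zip(gr, kr)] for gr, kr in zip(grown, mask)]
    return current


mask = [[1, 1, 1, 1, 0, 1]]    # a 4-pixel shape plus one isolated pixel
marker = [[1, 0, 0, 0, 0, 0]]  # seed inside the shape
partial = reconstruct_fixed(marker, mask, 1)
full = reconstruct_fixed(marker, mask, 3)
```

With one iteration the shape is only partly recovered, and with three it is complete. The isolated pixel is never reached, no matter how many iterations are run, which is the feature-removal effect described above.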
Postprocessing
The Dosage class supports a postprocess parameter. When True, a sigmoid function is applied to the result. This can be useful to intensify the values of salient regions as well as to disregard regions that are not salient enough.
[Figure panels: Input | postprocess = False | postprocess = True]
You can control the intensity by tweaking sigmoid_strength and the center (the salient vs non-salient cut-off point) with sigmoid_center.
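As a rough illustration of how a center and a strength shape a sigmoid, the sketch below uses the standard logistic function on a single value in [0, 1]. The exact formula and default values used by the library are not given in this guide, so the function name, the formula and the defaults of 0.5 and 10.0 are assumptions.

```python
import math


def sigmoid_postprocess(value, center=0.5, strength=10.0):
    """Standard logistic curve: values above `center` are pushed toward 1,
    values below it toward 0. A larger `strength` gives a steeper transition.
    """
    return 1.0 / (1.0 + math.exp(-strength * (value - center)))


boosted = sigmoid_postprocess(0.9)     # well above the center, moves closer to 1
suppressed = sigmoid_postprocess(0.1)  # well below the center, moves closer to 0
```

Raising the center discards more of the weakly salient regions, while raising the strength makes the cut-off sharper.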