7 Best Practices To Ensure Consistency In Image Annotation

Computer vision models now match, and in many cases exceed, human performance in medical diagnosis using imaging tests such as X-rays and MRIs. They can, for example, identify cancerous tumors very early and thus help treat them sooner. This is possible thanks in no small measure to image annotation, which helps computer vision systems identify patterns and make predictions.

Well-annotated datasets are, therefore, paramount if we want to create computer vision models that are accurate and reliable. In this article, we’ll outline some of the best image annotation practices to maintain consistency and completeness, which will ensure accuracy in the system or model.

Why consistency in annotation is critical

Image annotation is the bedrock of computer vision systems. Proper, consistent annotation enables these systems to understand and interpret images in a way that aligns with reality. Poor, inconsistent annotation, on the other hand, can confuse the machine learning model, perpetuate errors, and cause it to drift, becoming progressively less accurate over time.

Consistent annotations across images facilitate the model in identifying patterns and relationships in the data, which helps it make more accurate predictions. Clarity and consistency in annotations reduce ambiguity, noise, and confusion. This minimizes the likelihood of misinterpretation of the data and ensures that the model learns and performs as intended.

Given that computer vision is key in various consequential applications like medical diagnostics and driverless cars, it is imperative that the systems are trained with good, rich, and unbiased data. A diverse sample of data with consistent annotation is requisite.

The question, then, is how to achieve that. We'll cover this in the following section.

Best practices for maintaining image annotation consistency

There are several practices we can follow to improve image annotation and ensure consistency in labeled datasets. Some establish consistency and accuracy, enabling models to learn efficiently and make accurate predictions; some save annotators time and resources; others help build trust and credibility. Still others enhance interoperability between teams and systems and ensure that annotated datasets remain relevant and reliable for future projects.

Here are some best practices that will help establish the above.

●     Develop clear and comprehensive annotation guidelines

Because annotation is not a one-person job, the importance of clear and comprehensive annotation guidelines cannot be overemphasized. Guidelines serve as a blueprint for annotators, providing precise instructions and standards to adhere to. This ensures that annotations are done consistently across the board.

Clear guidelines reduce the scope for misinterpretation that arises out of ambiguity or lack of instructions. This leaves less room for personal biases and subjective inferences to creep into the annotations and make them inconsistent. While human subjectivity will always remain, a very detailed rubric helps ensure that it conforms to an objective standard.

The guidelines need to be comprehensive yet specific, and as exhaustive as possible. They should include reference materials that serve as visual examples of correct and incorrect annotations. Ensure that annotators understand them clearly, and leave ample room for them to raise questions and seek clarifications. Finally, document any changes. This makes it easier to revise older annotations and ensures that annotators are always working from the latest version.

●     Use a standard set of labels

Different annotators may label the same thing differently. To ensure consistency, use a standard set of labels. Doing so reduces the confusion and ambiguity that arise when annotators use preferential (but not necessarily preferable) or non-standard terms.

For example, when annotating images for training autonomous vehicles, one annotator may refer to a person walking as a “pedestrian,” while another may be inclined to use the term “foot passenger.” Standardizing labels ensures that the terminology used is consistent.

Adopt established standards wherever possible so that your labels are easy for anyone to understand and consistent with external annotations. Provide a glossary with clear definitions and examples, and keep the terminology up to date.
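To make this concrete, here is a minimal sketch in Python of how a canonical label set and a synonym map can enforce a standard vocabulary when annotations are ingested. The label names and synonyms are hypothetical and would come from your own glossary:

```python
# Illustrative sketch: normalize free-form annotator terms (e.g. "foot
# passenger") to one canonical label. All names here are hypothetical.

CANONICAL_LABELS = {"pedestrian", "cyclist", "car", "truck", "traffic_light"}

SYNONYMS = {
    "foot passenger": "pedestrian",
    "person walking": "pedestrian",
    "bike rider": "cyclist",
    "lorry": "truck",
}

def normalize_label(raw: str) -> str:
    """Map a raw annotator label to its canonical form, or raise if unknown."""
    label = raw.strip().lower()
    label = SYNONYMS.get(label, label)
    if label not in CANONICAL_LABELS:
        raise ValueError(f"Unknown label: {raw!r} - add it to the glossary first")
    return label

print(normalize_label("Foot Passenger"))  # -> "pedestrian"
```

Rejecting unknown terms outright, rather than silently accepting them, forces the glossary to stay the single source of truth for the label vocabulary.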

●     Handle ambiguities uniformly

In the real world that the data represents, ambiguity and fuzziness abound, especially in complex datasets or unusual scenarios. This creates ample room for inconsistencies in annotations owing to annotator subjectivity. When an object is partially obscured, for example, how should annotators deal with it? And if there are diverse interpretations of the same scenario, which one should prevail?

What can be done to ensure ambiguities are dealt with uniformly, no matter who is annotating? A few crucial things:

  1. Establish a clear protocol for dealing with uncertain situations, so that individual interpretations do not unduly influence the annotations.
  2. Provide clear criteria for disambiguation. This may involve developing decision trees or a set of rules that annotators can follow.
  3. Establish confidence scores and assign a level of certainty to each annotation (a simple sketch of this follows below).
  4. Have multiple annotators independently review and resolve ambiguous cases.
  5. Record the ambiguities encountered and how they were resolved. Documentation helps other annotators deal with similar issues in the future.

Handling ambiguities uniformly helps maintain consistency in annotations. This in turn leads to a more accurate and reliable annotated dataset.
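As a rough illustration of points 2 and 3 above, the following Python sketch records a confidence score per annotation and applies one uniform rule for escalating ambiguous cases. The field names and thresholds are hypothetical and would be set by your own guidelines:

```python
# Illustrative sketch: a uniform escalation rule for ambiguous annotations.
# Heavy occlusion or low annotator confidence routes the item to review.

from dataclasses import dataclass

@dataclass
class Annotation:
    image_id: str
    label: str
    occlusion: float   # fraction of the object hidden, 0.0 - 1.0
    confidence: float  # annotator's self-reported certainty, 0.0 - 1.0

def needs_second_review(ann: Annotation,
                        max_occlusion: float = 0.5,
                        min_confidence: float = 0.7) -> bool:
    """Apply the same disambiguation rule to every annotator's work."""
    return ann.occlusion > max_occlusion or ann.confidence < min_confidence

ann = Annotation("img_0042", "pedestrian", occlusion=0.6, confidence=0.9)
print(needs_second_review(ann))  # True: too occluded, escalate to a reviewer
```

Encoding the rule in one place, rather than leaving it to each annotator's judgment, is what makes the handling of edge cases reproducible.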

●     Have multiple annotators label the same images

Images have multiple facets that a single annotator may not see. Having multiple annotators label the same images introduces a diverse range of perspectives, enables comparison between different annotations, and makes the labels more thorough. This helps identify and rectify discrepancies and also adds more details to the data that individual annotators may overlook. This collaborative and consensus-building approach makes the annotations not only more consistent and accurate but also more detailed and contextual. It can also reduce personal and subjective biases invariably associated with human annotators.

To make the most of multiple annotators, let them annotate independently, without knowledge of other annotators’ labels. This prevents one annotator’s bias from being replicated and allows maximum leeway for individual judgment. Then have them review one another’s annotations and consult experts to resolve discrepancies.

You may use metrics like Cohen’s kappa (for two annotators) or Fleiss’ kappa (for more than two annotators) to gauge the level of agreement between labelers. This provides a measure of consistency and, by extension, of the accuracy and reliability of the annotations.
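For instance, Cohen’s kappa for two annotators can be computed with scikit-learn’s cohen_kappa_score; the labels below are hypothetical:

```python
# Illustrative sketch: measuring inter-annotator agreement on the same images.

from sklearn.metrics import cohen_kappa_score

annotator_a = ["pedestrian", "car", "car", "cyclist", "pedestrian", "car"]
annotator_b = ["pedestrian", "car", "truck", "cyclist", "pedestrian", "car"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, ~0 = chance level

# For more than two annotators, Fleiss' kappa is available in statsmodels
# (statsmodels.stats.inter_rater.fleiss_kappa), applied to a table of
# per-item category counts.
```

Tracking kappa over time also tells you whether guideline updates and feedback are actually improving agreement.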

●     Randomly sample annotated images for quality checks

Select annotated images at random and perform quality checks to identify and address potential inconsistencies and inaccuracies early on. Random samples will give you a good idea of areas prone to errors and conflict among different annotators.

It is important to be mindful of the sample size used for quality assurance. A small sample can be reviewed quickly, but it is less helpful for detecting anomalies and inconsistencies. A large sample, on the other hand, increases the cost and effort of review, though it gives a more complete picture.

Have clearly defined sampling criteria for selecting images, as well as guidelines for the quality checks themselves. These can include the percentage of images to be sampled and the categories to focus on, and they should also spell out the steps to take when inconsistencies or inaccuracies are found in the samples.
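As an illustration, a minimal Python sketch of random sampling for quality checks might look like this (the 5% rate and the image IDs are hypothetical placeholders):

```python
# Illustrative sketch: draw a fixed percentage of annotated images,
# uniformly at random, for manual quality review.

import random

def sample_for_qa(annotated_ids: list[str], fraction: float = 0.05,
                  seed: int | None = None) -> list[str]:
    """Return a random subset of image IDs to send for manual review."""
    rng = random.Random(seed)
    k = max(1, round(len(annotated_ids) * fraction))
    return rng.sample(annotated_ids, k)

image_ids = [f"img_{i:05d}" for i in range(2_000)]
qa_batch = sample_for_qa(image_ids, fraction=0.05, seed=42)
print(len(qa_batch))  # 100 images set aside for quality review
```

Fixing the seed makes a given QA batch reproducible, which helps when several reviewers need to audit the same sample.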

●     Establish an iterative feedback loop

An iterative feedback loop provides a mechanism for annotators to seek clarification, share insights, and address issues and questions they encounter. It allows and encourages communication between annotators and subject matter experts and fosters a culture of collaboration. This can have a tangible impact on the quality and consistency of the annotation.

Continuous feedback also ensures that annotators are aligned with the guidelines and adhere to the standards and conventions.

For this, you need to have clear communication channels and establish links between the annotators, the company, and domain experts. Encourage open communication and make it easy for annotators to raise and resolve their questions.

●     Select the right annotators and means

You may have all the above practices in place, but if you don’t have the right annotators, they mean little. Annotators are, after all, the ones who are going to execute the practices. So, selecting the right annotators, with technical proficiency specific to your project’s requirements, is fundamental to maintaining accuracy and consistency in data annotations.

But who the right annotator is can be a tricky question, and the answer varies depending on what kind of arrangement you are seeking. If you are crowdsourcing, the choice largely comes down to selecting the platform. If you annotate in-house, you need to assess each annotator’s proficiency with the tools and familiarity with the labeling ontology, and assign tasks accordingly.

However, each of these approaches has its own weaknesses. Crowdsourcing is cheap but unreliable and difficult to scale; in-house annotation makes it easier to enforce consistency and a standard of quality, but it is costly and often slow.

Considering this, it may be advantageous to outsource image annotation to experts with domain-specific knowledge and the requisite skills and tools. Adequately equipped, they can offer image annotation services that outperform what in-house teams or crowdsourcing can deliver.

Consistency is the key

Consistent data annotation is crucial if we want computer vision systems to learn efficiently and make accurate predictions. There is no workaround. There are no substitutes. But there are solutions.

The practices outlined here will help you maintain consistency in image annotations. Following them does not guarantee a perfect model, but it will go a long way toward high-quality, accurate annotation. A surer route is to entrust the task to outsourcing companies that have the expertise and resources; this is often the most effective way to generate accurate and consistent annotated images, and frequently the cheapest as well.
