Which of the following sampling techniques could lead to an underrepresentation of minority classes?

Get ready for the CertNexus Certified Data Science Practitioner Test. Practice with flashcards and multiple-choice questions; each question has hints and explanations. Excel in your exam!

Random undersampling reduces the size of the majority class in order to create a balanced dataset. It can still leave minority classes poorly represented in practice: the majority-class instances to remove are picked at random, so valuable information may be discarded from the dataset. If the minority class is very small to begin with, undersampling makes this worse, because balancing down to that size leaves only a small training set, making it harder for the model to learn the characteristics of the minority class effectively.
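To make the data loss concrete, here is a minimal sketch of random undersampling using NumPy. The function name `random_undersample`, the toy 95/5 class split, and the seed are assumptions chosen for illustration; they are not part of the exam material or of any particular library.

```python
import numpy as np

def random_undersample(X, y, seed=0):
    """Randomly drop rows so every class has as many rows as the rarest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()  # size of the smallest class
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original row order
    return X[keep], y[keep]

# Hypothetical toy data: 95 majority rows (label 0), 5 minority rows (label 1).
y = np.array([0] * 95 + [1] * 5)
X = np.arange(100).reshape(-1, 1)

X_bal, y_bal = random_undersample(X, y)
print(np.bincount(y_bal))              # [5 5]
print(len(y_bal), "of", len(y), "rows kept")  # 10 of 100 rows kept
```

In this toy case all five minority rows survive, but 90 of the 95 majority rows are thrown away, so the model would train on only 10 rows in total.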

In contrast, random oversampling increases the number of minority-class instances by duplicating them, SMOTE generates synthetic samples for the minority class, and bootstrap sampling draws repeated samples with replacement from the dataset, typically to improve model robustness. Each of these methods changes the training data in its own way, but only random undersampling directly throws data away, and that loss is what can weaken how well the minority class ends up represented in the trained model.
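For comparison, the following is a rough sketch of random oversampling and of a single bootstrap sample, again using NumPy only. The toy labels and variable names are assumptions made for illustration. SMOTE is left out because it needs a neighbor-based interpolation step that does not fit in a few lines.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy labels: 95 majority rows (label 0), 5 minority rows (label 1).
y = np.array([0] * 95 + [1] * 5)

# Random oversampling: keep every row, then add minority rows drawn
# with replacement until both classes have the same count.
minority_idx = np.flatnonzero(y == 1)
majority_idx = np.flatnonzero(y == 0)
extra = rng.choice(
    minority_idx,
    size=len(majority_idx) - len(minority_idx),
    replace=True,
)
over_idx = np.concatenate([np.arange(len(y)), extra])
y_over = y[over_idx]
print(np.bincount(y_over))  # [95 95]

# Bootstrap sample: draw len(y) rows with replacement from the whole dataset.
# The class counts depend on the random draw and are not forced to balance.
boot_idx = rng.choice(len(y), size=len(y), replace=True)
y_boot = y[boot_idx]
print(np.bincount(y_boot, minlength=2))
```

Oversampling keeps all of the original rows and only adds duplicates, while a plain bootstrap sample keeps the dataset size but leaves the class ratio roughly as it was.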
