Sunday, May 23

Meta-cognitive judgements of change detection predict change blindness

Adam Barnas, Emily Ward

In general, people tend to think that they will notice large changes made to visual scenes (Levin et al., 2000), and this collective overconfidence is part of what makes the phenomenon of change blindness so surprising. Yet, despite their overconfidence, are individuals in fact aware of and able to assess the relative difficulty of changes? We investigated whether participants’ judgments of their ability to detect changes predicted their own change blindness duration. First, participants (N = 219) recruited from Amazon Mechanical Turk completed a standard change blindness task consisting of 30 scenes that cycled between an unmodified and modified version of the image. Participants pressed a button when they noticed the change, providing a measure of change blindness. After 6 to 7 months had passed, we re-contacted the same participants and showed them the same 30 scenes, now with the unmodified and modified versions presented side-by-side and a bounding box highlighting the change. Participants rated how likely they would be to spot the change using a 5-point Likert scale. We found that participants’ ratings of change detection significantly predicted their change blindness duration for each image, such that changes rated as likely to be spotted were detected faster than changes rated as unlikely to be spotted (p < .001). These ratings continued to be predictive when accounting for the eccentricity and size of the change (p < .001). However, there was no advantage to using participants’ own ratings of change detection ability compared to the ratings from an independent group to predict change blindness duration, suggesting that differences among images (rather than among individuals) contribute the most to change blindness. Together, these findings indicate that instead of having indiscriminate overconfidence, people are aware of the relative difficulty of changes, and their meta-cognitive judgments of change detection ability accurately predict change blindness.
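The core analysis described above (predicting each image's change blindness duration from its Likert rating while controlling for the eccentricity and size of the change) can be sketched as an ordinary least-squares regression. All data, variable names, and coefficients below are simulated stand-ins, not the study's data or the authors' exact model:

```python
import numpy as np

# Simulated stand-in data for the regression sketch (not the study's data)
rng = np.random.default_rng(0)
n_images = 30

likert = rng.integers(1, 6, size=n_images).astype(float)  # 1-5 ratings
eccentricity = rng.uniform(0, 10, size=n_images)          # deg of visual angle
size = rng.uniform(0.5, 5, size=n_images)                 # deg of visual angle

# Simulated durations: higher ratings -> faster detection (negative slope)
duration = (20 - 2.5 * likert + 0.3 * eccentricity - 0.5 * size
            + rng.normal(0, 1, size=n_images))

# Design matrix: intercept plus the rating and the two control predictors
X = np.column_stack([np.ones(n_images), likert, eccentricity, size])
beta, *_ = np.linalg.lstsq(X, duration, rcond=None)

# Negative rating slope: changes rated as likely to be spotted are found faster
print(f"rating slope: {beta[1]:.2f}")
```

A mixed-effects model with by-participant and by-image terms would be closer to the repeated-measures design; the OLS version above only illustrates the control-variable logic.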

Poster: Sunday, May 23, 7:00 – 9:00 am CDT, Manatee

[VSS Link]

Learning to identify visual signals of intentionality

Mohan Ji, Emily Ward, C. Shawn Green

The human visual system not only provides information about the physical state of the environment, but it also provides information about the causal structures that underlie it. One example is our ability to perceive animacy and intentionality. Even when viewing displays of simple shapes on computer screens, we tend to interpret certain cues (e.g., self-propulsion) as strongly indicative of animacy or intentionality. One behavior that provides especially strong cues of animacy and intentionality is predatory chasing. Here we used a chasing task in a noisy environment to test participants’ ability to detect and identify visual signals of intentionality. Participants viewed videos in which one red dot (“sheep”) tried to escape from a chasing white dot (“wolf”) hiding among 19 other randomly moving white dots. The participants’ task was to detect and identify which of the 20 white dots was the wolf. The videos themselves were generated from the gameplay of another group of participants, who either played as the sheep against a computer-controlled wolf or played against each other, one human as sheep and one as wolf. The videos were further categorized by whether the sheep was eventually caught. We found that when participants viewed these videos, they were more accurate at detecting the wolf in human vs. computer trials than in human vs. human trials. We also found that participants’ detection accuracy varied significantly across trials even within the same condition. For each video, we quantified the extent to which the wolf showed “direct chasing” behavior. We found that the more direct chasing behavior a trial contained, the easier it was for participants to identify the wolf. Our results show that participants could identify intentional agents in noisy environments based on certain behaviors utilized by the agents.
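One natural way to operationalize “direct chasing” is the fraction of frames in which the wolf's heading points nearly straight at the sheep. The function below is a hypothetical sketch of such a metric; the angle threshold and all names are assumptions for illustration, not the authors' actual definition:

```python
import numpy as np

def chasing_directness(wolf_pos, sheep_pos, angle_thresh_deg=30.0):
    """Fraction of frames in which the wolf's heading points within
    `angle_thresh_deg` of the sheep ("direct chasing").
    wolf_pos, sheep_pos: (T, 2) arrays of x, y positions per frame.
    (Hypothetical metric: threshold and definition are assumptions.)"""
    heading = np.diff(wolf_pos, axis=0)        # wolf displacement per frame
    to_sheep = sheep_pos[:-1] - wolf_pos[:-1]  # vector from wolf toward sheep
    # Cosine of the angle between the heading and the wolf->sheep direction
    cos = np.sum(heading * to_sheep, axis=1) / (
        np.linalg.norm(heading, axis=1) * np.linalg.norm(to_sheep, axis=1) + 1e-9)
    angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return float(np.mean(angles < angle_thresh_deg))

# A wolf moving straight at a stationary sheep is maximally direct
t = np.linspace(0, 1, 50)[:, None]
wolf = t * np.array([10.0, 10.0])        # moves from (0, 0) toward (10, 10)
sheep = np.tile([10.0, 10.0], (50, 1))   # stationary at (10, 10)
print(chasing_directness(wolf, sheep))   # → 1.0
```

A per-trial score like this could then be correlated with detection accuracy, matching the analysis the abstract describes.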

Poster: Sunday, May 23, 3:00 – 5:00 pm CDT, Egret

[VSS Link]

Tuesday, May 25

Visual ensemble representations in Deep Neural Networks trained for natural object recognition

Siddharth Suresh, Emily J. Ward

Humans can quickly pool information from across many individual objects to perceive ensemble properties, like the average size or color diversity of objects. Such ensemble perception in humans is thought to occur extremely efficiently and automatically, but how it arises in the first place is unknown. Does ensemble perception arise because the visual system must solve many different types of perceptual problems, or are ensemble properties represented even in a system with the sole goal of recognizing individual objects? We used an artificial visual system—a deep neural network (DNN)—to determine whether the ensemble properties of average size and color diversity were present in a network pre-trained to recognize only individual natural objects. We presented the network with new images that were completely different from its training set: images of white circles of different sizes (randomly chosen from a specified range) or letter arrays containing four colored consonants with each letter drawn either from a broad sample of 19 colors (high diversity) or a randomly selected range of six adjacent colors (low diversity). Therefore, the ensemble properties of interest were a summary statistic for the whole image and not recoverable from any individual element. We tested whether a ResNet50 neural network could predict the average size or distinguish high vs low color diversity arrays by using the activations from different layers as input to a linear regressor and a linear classifier (SVM). We found that the network activations were highly accurate at predicting the average size and identifying the color diversity, even at the earlier layers in the network. In contrast, information about individual object features (object size) increased in the deeper layers. This demonstrates that artificial visual systems trained to only recognize individual objects also extract ensemble properties of multiple objects extremely early in visual processing.
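The readout stage described above (a linear regressor for average size and a linear SVM for color diversity, trained on layer activations) might look like the sketch below. Because a pre-trained ResNet50 is too heavy to embed here, the “activations” are simulated features that carry an ensemble statistic; everything after the feature matrix mirrors the described decoding analysis:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Simulated stand-in for layer activations: in the study these would come
# from a ResNet50 layer; here each image's features encode an ensemble
# statistic plus noise, so the readout pipeline can run standalone.
rng = np.random.default_rng(1)
n_images, n_units = 200, 64

avg_size = rng.uniform(1, 10, size=n_images)   # regression target
diversity = rng.integers(0, 2, size=n_images)  # high/low diversity label
W = rng.normal(size=(2, n_units))
feats = (np.outer(avg_size, W[0]) + np.outer(diversity, W[1])
         + rng.normal(0, 1.0, size=(n_images, n_units)))

# Linear regressor predicts average size; linear SVM classifies diversity
r2 = cross_val_score(LinearRegression(), feats, avg_size, cv=5).mean()
acc = cross_val_score(SVC(kernel="linear"), feats, diversity, cv=5).mean()
print(f"size decoding R^2: {r2:.2f}, diversity accuracy: {acc:.2f}")
```

In the actual analysis, `feats` would be replaced by activations extracted from each ResNet50 layer in turn, so that decoding accuracy can be compared across network depth.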

Talk: Tuesday, May 25, 9:30 – 11:00 am CDT, Talk Room 1

[VSS Link]

Wednesday, May 26

To Each Their Own: Measuring Familiarity for Face Images

Y. Ivette Colón, Emily J. Ward

Familiarity for never-before-seen faces is a phenomenon tied to both visual perception and personal experience (Lyon, 1996). Can face images be intrinsically familiar? If so, can familiarity be measured consistently? We obtained three measures of familiarity for 100 hyper-realistic, GAN-generated faces, and examined the correspondence in responses among participants and among tasks. Our first task captured memorability (accurate recognition of something previously seen; recognition hit rate, Bainbridge et al., 2016) and familiarity (false recognition of something not previously seen; false alarm rate). However, false-alarm-based quantification of familiarity alone is likely more conservative than our typical experiences of familiarity. Therefore, in a second task, we measured familiarity using an untimed forced-choice task in which a new group of participants chose the “more familiar” face in random pairs of faces. The resulting score for each face across participants serves as its familiarity score. Finally, in a third task, we aimed to capture the subjective nature of familiarity for individual faces by having a third group of participants rate faces on a sliding scale between “Not at all familiar” and “Extremely familiar”. To establish the reliability among participants in their familiarity judgements, we computed Kendall ranked correlations between image rankings (by familiarity score) for 100 split-halves of the data for each task. We found widespread variability in image rankings (Exp. 1 mean tau = 0.07, Exp. 2 = 0.01, Exp. 3 = 0.04). We calculated the consistency of participant responses relative to population responses using logistic regression to predict familiarity scores and found varying levels of agreement by participant. Finally, we computed a Kendall correlation for image rankings between tasks and found no significant correlation. The lack of correspondence in responses among participants and tasks suggests that “familiarity” is likely not an intrinsic property of faces and that experimental measures may fail to capture our everyday experience of face familiarity consistently.
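The split-half reliability computation described above might be sketched as follows. The ratings here are random stand-ins for the three tasks' data (so the mean tau should sit near zero); the split-and-correlate loop is the part that mirrors the analysis:

```python
import numpy as np
from scipy.stats import kendalltau

# Random stand-in familiarity scores: rows are participants, columns are faces
rng = np.random.default_rng(2)
n_subj, n_faces, n_splits = 40, 100, 100
ratings = rng.uniform(0, 1, size=(n_subj, n_faces))

# For each split: halve the participants, rank the faces by mean score
# within each half, and correlate the two rankings with Kendall's tau
taus = []
for _ in range(n_splits):
    order = rng.permutation(n_subj)
    half_a, half_b = order[: n_subj // 2], order[n_subj // 2:]
    tau, _ = kendalltau(ratings[half_a].mean(axis=0),
                        ratings[half_b].mean(axis=0))
    taus.append(tau)

print(f"mean split-half tau: {np.mean(taus):.3f}")
```

Applied to real ratings, a mean tau near zero, as reported above, indicates that participants do not agree on which faces are familiar.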

Talk: Wednesday, May 26, 12:00 – 1:30 pm CDT, Talk Room 2

[VSS Link]